1
Miga KH, Eichler EE. Envisioning a new era: Complete genetic information from routine, telomere-to-telomere genomes. Am J Hum Genet 2023; 110:1832-1840. [PMID: 37922882 PMCID: PMC10645551 DOI: 10.1016/j.ajhg.2023.09.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 09/19/2023] [Accepted: 09/20/2023] [Indexed: 11/07/2023]
Abstract
Advances in long-read sequencing and assembly now mean that individual labs can generate phased genomes that are more accurate and more contiguous than the original human reference genome. With declining costs and increasing democratization of technology, we suggest that complete genome assemblies, where both parental haplotypes are phased telomere to telomere, will become standard in human genetics. Soon, even in clinical settings where rigorous sample-handling standards must be met, affected individuals could have reference-grade genomes fully sequenced and assembled in just a few hours given advances in technology, computational processing, and annotation. Complete genetic variant discovery will transform how we map, catalog, and associate variation with human disease and fundamentally change our understanding of the genetic diversity of all humans.
Affiliation(s)
- Karen H Miga
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
2
Altemose N, Logsdon GA, Bzikadze AV, Sidhwani P, Langley SA, Caldas GV, Hoyt SJ, Uralsky L, Ryabov FD, Shew CJ, Sauria MEG, Borchers M, Gershman A, Mikheenko A, Shepelev VA, Dvorkina T, Kunyavskaya O, Vollger MR, Rhie A, McCartney AM, Asri M, Lorig-Roach R, Shafin K, Aganezov S, Olson D, de Lima LG, Potapova T, Hartley GA, Haukness M, Kerpedjiev P, Gusev F, Tigyi K, Brooks S, Young A, Nurk S, Koren S, Salama SR, Paten B, Rogaev EI, Streets A, Karpen GH, Dernburg AF, Sullivan BA, Straight AF, Wheeler TJ, Gerton JL, Eichler EE, Phillippy AM, Timp W, Dennis MY, O'Neill RJ, Zook JM, Schatz MC, Pevzner PA, Diekhans M, Langley CH, Alexandrov IA, Miga KH. Complete genomic and epigenetic maps of human centromeres. Science 2022; 376:eabl4178. [PMID: 35357911 PMCID: PMC9233505 DOI: 10.1126/science.abl4178] [Citation(s) in RCA: 167] [Impact Index Per Article: 83.5] [Indexed: 12/20/2022]
Abstract
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
Affiliation(s)
- Nicolas Altemose
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
- Pragya Sidhwani
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Sasha A. Langley
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Gina V. Caldas
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Savannah J. Hoyt
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Lev Uralsky
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Colin J. Shew
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
- Ariel Gershman
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
- Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Olga Kunyavskaya
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ann M. McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mobin Asri
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Ryan Lorig-Roach
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Sergey Aganezov
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Daniel Olson
- Department of Computer Science, University of Montana, Missoula, MT, USA
- Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Gabrielle A. Hartley
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Marina Haukness
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Fedor Gusev
- Vavilov Institute of General Genetics, Moscow, Russia
- Kristof Tigyi
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Shelise Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Alice Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sofie R. Salama
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
- Evgeny I. Rogaev
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
- Aaron Streets
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Gary H. Karpen
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- BioEngineering and BioMedical Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Abby F. Dernburg
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA
- Beth A. Sullivan
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA
- Travis J. Wheeler
- Department of Computer Science, University of Montana, Missoula, MT, USA
- Jennifer L. Gerton
- Stowers Institute for Medical Research, Kansas City, MO, USA
- University of Kansas Medical School, Department of Biochemistry and Molecular Biology and Cancer Center, University of Kansas, Kansas City, KS, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Winston Timp
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Megan Y. Dennis
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
- Rachel J. O'Neill
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Justin M. Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Michael C. Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California at San Diego, San Diego, CA, USA
- Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Charles H. Langley
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
- Ivan A. Alexandrov
- Vavilov Institute of General Genetics, Moscow, Russia
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
- Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
3
Abstract
We are entering a new era in genomics where entire centromeric regions are accurately represented in human reference assemblies. Access to these high-resolution maps will enable new surveys of sequence and epigenetic variation in the population and offer new insight into satellite array genomics and centromere function. Here, we focus on the sequence organization and evolution of alpha satellites, which are credited as the genetic and genomic definition of human centromeres due to their interaction with inner kinetochore proteins and their importance in the development of human artificial chromosome assays. We provide an overview of alpha satellite repeat structure and array organization in the context of these high-quality reference data sets; discuss the emergence of variation-based surveys; and provide perspective on the role of this new source of genetic and epigenetic variation in the context of chromosome biology, genome instability, and human disease.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California 95064, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
- Ivan A Alexandrov
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199004, Russia; Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
4
Naish M, Alonge M, Wlodzimierz P, Tock AJ, Abramson BW, Schmücker A, Mandáková T, Jamge B, Lambing C, Kuo P, Yelina N, Hartwick N, Colt K, Smith LM, Ton J, Kakutani T, Martienssen RA, Schneeberger K, Lysak MA, Berger F, Bousios A, Michael TP, Schatz MC, Henderson IR. The genetic and epigenetic landscape of the Arabidopsis centromeres. Science 2021; 374:eabi7489. [PMID: 34762468 PMCID: PMC10164409 DOI: 10.1126/science.abi7489] [Citation(s) in RCA: 145] [Impact Index Per Article: 48.3] [Indexed: 11/02/2022]
Abstract
Centromeres attach chromosomes to spindle microtubules during cell division and, despite this conserved role, show paradoxically rapid evolution and are typified by complex repeats. We used long-read sequencing to generate the Col-CEN Arabidopsis thaliana genome assembly that resolves all five centromeres. The centromeres consist of megabase-scale tandemly repeated satellite arrays, which support CENTROMERE SPECIFIC HISTONE H3 (CENH3) occupancy and are densely DNA methylated, with satellite variants private to each chromosome. CENH3 preferentially occupies satellites that show the least amount of divergence and occur in higher-order repeats. The centromeres are invaded by ATHILA retrotransposons, which disrupt genetic and epigenetic organization. Centromeric crossover recombination is suppressed, yet low levels of meiotic DNA double-strand breaks occur that are regulated by DNA methylation. We propose that Arabidopsis centromeres are evolving through cycles of satellite homogenization and retrotransposon-driven diversification.
Affiliation(s)
- Matthew Naish
- Department of Plant Sciences, Downing Street, University of Cambridge, Cambridge CB2 3EA, UK
- Michael Alonge
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Piotr Wlodzimierz
- Department of Plant Sciences, Downing Street, University of Cambridge, Cambridge CB2 3EA, UK
- Andrew J. Tock
- Department of Plant Sciences, Downing Street, University of Cambridge, Cambridge CB2 3EA, UK
- Bradley W. Abramson
- The Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
- Anna Schmücker
- Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Terezie Mandáková
- Central European Institute of Technology (CEITEC), Masaryk University, Kamenice 5, Brno 625 00, Czech Republic
- Bhagyshree Jamge
- Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Christophe Lambing
- Department of Plant Sciences, Downing Street, University of Cambridge, Cambridge CB2 3EA, UK
- Pallas Kuo
- Department of Plant Sciences, Downing Street, University of Cambridge, Cambridge CB2 3EA, UK
- Natasha Yelina
- Department of Plant Sciences, Downing Street, University of Cambridge, Cambridge CB2 3EA, UK
- Nolan Hartwick
- The Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
- Kelly Colt
- The Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
- Lisa M. Smith
- School of Biosciences and Institute for Sustainable Food, University of Sheffield, Sheffield S10 2TN, UK
- Jurriaan Ton
- School of Biosciences and Institute for Sustainable Food, University of Sheffield, Sheffield S10 2TN, UK
- Tetsuji Kakutani
- Department of Biological Sciences, University of Tokyo, Tokyo, Japan
- Robert A. Martienssen
- Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Korbinian Schneeberger
- Faculty of Biology, LMU Munich, Großhaderner Str. 2, 82152 Planegg-Martinsried, Germany
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829 Cologne, Germany
- Martin A. Lysak
- Central European Institute of Technology (CEITEC), Masaryk University, Kamenice 5, Brno 625 00, Czech Republic
- Frédéric Berger
- Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Todd P. Michael
- The Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
- Michael C. Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Ian R. Henderson
- Department of Plant Sciences, Downing Street, University of Cambridge, Cambridge CB2 3EA, UK
5
Suzuki Y, Morishita S. The time is ripe to investigate human centromeres by long-read sequencing†. DNA Res 2021; 28:6381569. [PMID: 34609504 PMCID: PMC8502840 DOI: 10.1093/dnares/dsab021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/31/2021] [Accepted: 09/28/2021] [Indexed: 01/05/2023]
Abstract
The complete sequencing of human centromeres, which are filled with highly repetitive elements, has long been challenging. In human centromeres, α-satellite monomers of about 171 bp in length are the basic repeating units, but α-satellite monomers constitute the higher-order repeat (HOR) units, and thousands of copies of highly homologous HOR units form large arrays, which have hampered sequence assembly of human centromeres. Because most HOR unit occurrences are covered by long reads of about 10 kb, the recent availability of much longer reads is expected to enable observation of individual HOR occurrences in terms of their single-nucleotide or structural variants. The time has come to examine the complete sequence of human centromeres.
Affiliation(s)
- Yuta Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8568, Japan
- Shinichi Morishita
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8568, Japan
6
Giannuzzi G, Logsdon GA, Chatron N, Miller DE, Reversat J, Munson KM, Hoekzema K, Bonnet-Dupeyron MN, Rollat-Farnier PA, Baker CA, Sanlaville D, Eichler EE, Schluth-Bolard C, Reymond A. Alpha satellite insertion close to an ancestral centromeric region. Mol Biol Evol 2021; 38:5576-5587. [PMID: 34464971 PMCID: PMC8662618 DOI: 10.1093/molbev/msab244] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
Abstract
Human centromeres are mainly composed of alpha satellite DNA hierarchically organized as higher-order repeats (HORs). Alpha satellite dynamics is shown by sequence homogenization in centromeric arrays and by its transfer to other centromeric locations, for example, during the maturation of new centromeres. We identified during prenatal aneuploidy diagnosis by fluorescent in situ hybridization a de novo insertion of alpha satellite DNA from the centromere of chromosome 18 (D18Z1) into cytoband 15q26. Although bound by CENP-B, this locus did not acquire centromeric functionality as demonstrated by the lack of constriction and the absence of CENP-A binding. The insertion was associated with a 2.8-kbp deletion and likely occurred in the paternal germline. The site was enriched in long terminal repeats and located ∼10 Mbp from the location where a centromere was ancestrally seeded and became inactive in the common ancestor of humans and apes 20–25 million years ago. Long-read mapping to the T2T-CHM13 human genome assembly revealed that the insertion derives from a specific region of chromosome 18 centromeric 12-mer HOR array in which the monomer size follows a regular pattern. The rearrangement did not directly disrupt any gene or predicted regulatory element and did not alter the methylation status of the surrounding region, consistent with the absence of phenotypic consequences in the carrier. This case demonstrates a likely rare but new class of structural variation that we name “alpha satellite insertion.” It also expands our knowledge on alphoid DNA dynamics and conveys the possibility that alphoid arrays can relocate near vestigial centromeric sites.
Affiliation(s)
- Giuliana Giannuzzi
- Department of Biosciences, University of Milan, Milan, Italy; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Institute of Biomedical Technologies, National Research Council, Milan, Italy
- Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Nicolas Chatron
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Service de génétique, Hospices Civils de Lyon, Lyon, France; Institut NeuroMyoGène, University of Lyon, Lyon, France
- Danny E Miller
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA, USA
- Julie Reversat
- Service de génétique, Hospices Civils de Lyon, Lyon, France
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Pierre-Antoine Rollat-Farnier
- Service de génétique, Hospices Civils de Lyon, Lyon, France; Cellule Bioinformatique, Hospices Civils de Lyon, Lyon, France
- Carl A Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Damien Sanlaville
- Service de génétique, Hospices Civils de Lyon, Lyon, France; Institut NeuroMyoGène, University of Lyon, Lyon, France
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- Caroline Schluth-Bolard
- Service de génétique, Hospices Civils de Lyon, Lyon, France; Institut NeuroMyoGène, University of Lyon, Lyon, France
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
7
Suzuki Y, Myers EW, Morishita S. Rapid and ongoing evolution of repetitive sequence structures in human centromeres. Sci Adv 2020; 6:eabd9230. [PMID: 33310858 PMCID: PMC7732198 DOI: 10.1126/sciadv.abd9230] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/20/2020] [Accepted: 10/30/2020] [Indexed: 06/12/2023]
Abstract
Our understanding of centromere sequence variation across human populations is limited by its extremely long nested repeat structures called higher-order repeats that are challenging to sequence. Here, we analyzed chromosomes 11, 17, and X using long-read sequencing data for 36 individuals from diverse populations including a Han Chinese trio and 21 Japanese. We revealed substantial structural diversity with many previously unidentified variant higher-order repeats specific to individuals characterizing rapid, haplotype-specific evolution of human centromeric arrays, while frequent single-nucleotide variants are largely conserved. We found a characteristic pattern shared among prevalent variants in human and chimpanzee. Our findings pave the way for studying sequence evolution in human and primate centromeres.
Affiliation(s)
- Yuta Suzuki
- The University of Tokyo, Graduate School of Frontier Sciences, Department of Computational Biology and Medical Sciences, Kashiwa, Chiba 277-8568, Japan.
- Eugene W Myers
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Shinichi Morishita
- The University of Tokyo, Graduate School of Frontier Sciences, Department of Computational Biology and Medical Sciences, Kashiwa, Chiba 277-8568, Japan.
8
Miga KH, Koren S, Rhie A, Vollger MR, Gershman A, Bzikadze A, Brooks S, Howe E, Porubsky D, Logsdon GA, Schneider VA, Potapova T, Wood J, Chow W, Armstrong J, Fredrickson J, Pak E, Tigyi K, Kremitzki M, Markovic C, Maduro V, Dutra A, Bouffard GG, Chang AM, Hansen NF, Wilfert AB, Thibaud-Nissen F, Schmitt AD, Belton JM, Selvaraj S, Dennis MY, Soto DC, Sahasrabudhe R, Kaya G, Quick J, Loman NJ, Holmes N, Loose M, Surti U, Risques RA, Graves Lindsay TA, Fulton R, Hall I, Paten B, Howe K, Timp W, Young A, Mullikin JC, Pevzner PA, Gerton JL, Sullivan BA, Eichler EE, Phillippy AM. Telomere-to-telomere assembly of a complete human X chromosome. Nature 2020; 585:79-84. [PMID: 32663838 PMCID: PMC7484160 DOI: 10.1038/s41586-020-2547-7] [Citation(s) in RCA: 390] [Impact Index Per Article: 97.5] [Received: 07/30/2019] [Accepted: 05/29/2020] [Indexed: 12/15/2022]
Abstract
After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. Here we present a human genome assembly that surpasses the continuity of GRCh38 [2], along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA.
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mitchell R Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Ariel Gershman
- Department of Molecular Biology and Genetics, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Andrey Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, San Diego, CA, USA
- Shelise Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD, USA
- Edmund Howe
- Stowers Institute for Medical Research, Kansas City, MO, USA
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Valerie A Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Joel Armstrong
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Evgenia Pak
- Cytogenetic and Microscopy Core, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Kristof Tigyi
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Milinn Kremitzki
- McDonnell Genome Institute at Washington University, St Louis, MO, USA
- Valerie Maduro
- Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Amalia Dutra
- Cytogenetic and Microscopy Core, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Gerard G Bouffard
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD, USA
- Alexander M Chang
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Nancy F Hansen
- Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Amy B Wilfert
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Megan Y Dennis
- Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California Davis, Davis, CA, USA
- Daniela C Soto
- Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California Davis, Davis, CA, USA
- Ruta Sahasrabudhe
- DNA Technologies Core, Genome Center, University of California Davis, Davis, CA, USA
- Gulhan Kaya
- Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California Davis, Davis, CA, USA
- Josh Quick
- Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK
- Nicholas J Loman
- Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK
- Nadine Holmes
- DeepSeq, School of Life Sciences, University of Nottingham, Nottingham, UK
- Matthew Loose
- DeepSeq, School of Life Sciences, University of Nottingham, Nottingham, UK
- Urvashi Surti
- Department of Pathology, University of Pittsburgh, Pittsburgh, PA, USA
- Rosa Ana Risques
- Department of Pathology, University of Washington, Seattle, WA, USA
- Robert Fulton
- McDonnell Genome Institute at Washington University, St Louis, MO, USA
- Ira Hall
- McDonnell Genome Institute at Washington University, St Louis, MO, USA
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Winston Timp
- Department of Molecular Biology and Genetics, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Alice Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD, USA
- James C Mullikin
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD, USA
- Pavel A Pevzner
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
- Beth A Sullivan
- Department of Molecular Genetics and Microbiology, Division of Human Genetics, Duke University Medical Center, Durham, NC, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
|
9
|
Sullivan LL, Sullivan BA. Genomic and functional variation of human centromeres. Exp Cell Res 2020; 389:111896. [PMID: 32035947 DOI: 10.1016/j.yexcr.2020.111896] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/19/2019] [Revised: 01/29/2020] [Accepted: 02/05/2020] [Indexed: 10/25/2022]
Abstract
Centromeres are central to chromosome segregation and genome stability, and thus their molecular foundations are important for understanding their function and the ways in which they go awry. Human centromeres typically form at large megabase-sized arrays of alpha satellite DNA for which there is little genomic understanding due to its repetitive nature. Consequently, it has been difficult to achieve genome assemblies at centromeres using traditional next generation sequencing approaches, so that centromeres represent gaps in the current human genome assembly. The role of alpha satellite DNA has been debated since centromeres can form, albeit rarely, on non-alpha satellite DNA. Conversely, the simple presence of alpha satellite DNA is not sufficient for centromere function since chromosomes with multiple alpha satellite arrays only exhibit a single location of centromere assembly. Here, we discuss the organization of human centromeres as well as genomic and functional variation in human centromere location, and current understanding of the genomic and epigenetic mechanisms that underlie centromere flexibility in humans.
Affiliation(s)
| | - Beth A Sullivan
- Department of Molecular Genetics and Microbiology, Division of Human Genetics, Duke University School of Medicine, Durham, NC 27710, USA.
|
10
|
Jain M, Olsen HE, Turner DJ, Stoddart D, Bulazel KV, Paten B, Haussler D, Willard HF, Akeson M, Miga KH. Linear assembly of a human centromere on the Y chromosome. Nat Biotechnol 2018; 36:321-323. [PMID: 29553574 PMCID: PMC5886786 DOI: 10.1038/nbt.4109] [Citation(s) in RCA: 167] [Impact Index Per Article: 27.8] [Received: 08/08/2017] [Accepted: 02/22/2018] [Indexed: 01/21/2023]
Abstract
The human genome reference sequence remains incomplete owing to the challenge of assembling long tracts of near-identical tandem repeats in centromeres. We implemented a nanopore sequencing strategy to generate high-quality reads that span hundreds of kilobases of highly repetitive DNA in a human Y chromosome centromere. Combining these data with short-read variant validation, we assembled and characterized the centromeric region of a human Y chromosome.
Affiliation(s)
- Miten Jain
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA
| | - Hugh E Olsen
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA
| | | | | | - Kira V Bulazel
- Duke Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA
| | - David Haussler
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA
| | - Huntington F Willard
- Duke Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina, USA; Geisinger National, Bethesda, Maryland, USA
| | - Mark Akeson
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA
| | - Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA; Duke Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina, USA
|
11
|
Popovic I, Marko PB, Wares JP, Hart MW. Selection and demographic history shape the molecular evolution of the gamete compatibility protein bindin in Pisaster sea stars. Ecol Evol 2014; 4:1567-88. [PMID: 24967076 PMCID: PMC4063459 DOI: 10.1002/ece3.1042] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 09/19/2013] [Revised: 02/15/2014] [Accepted: 02/26/2014] [Indexed: 12/18/2022] Open
Abstract
Reproductive compatibility proteins have been shown to evolve rapidly under positive selection leading to reproductive isolation, despite the potential homogenizing effects of gene flow. This process has been implicated in both primary divergence among conspecific populations and reinforcement during secondary contact; however, these two selective regimes can be difficult to discriminate from each other. Here, we describe the gene that encodes the gamete compatibility protein bindin for three sea star species in the genus Pisaster. First, we compare the full-length bindin-coding sequence among all three species and analyze the evolutionary relationships between the repetitive domains of the variable second bindin exon. The comparison suggests that concerted evolution of repetitive domains has an effect on bindin divergence among species and bindin variation within species. Second, we characterize population variation in the second bindin exon of two species: We show that positive selection acts on bindin variation in Pisaster ochraceus but not in Pisaster brevispinus, which is consistent with higher polyspermy risk in P. ochraceus. Third, we show that there is no significant genetic differentiation among populations and no apparent effect of sympatry with congeners that would suggest selection based on reinforcement. Fourth, we combine bindin and cytochrome c oxidase 1 data in isolation-with-migration models to estimate gene flow parameter values and explore the historical demographic context of our positive selection results. Our findings suggest that positive selection on bindin divergence among P. ochraceus alleles can be accounted for in part by relatively recent northward population expansions that may be coupled with the potential homogenizing effects of concerted evolution.
Affiliation(s)
- Iva Popovic
- Department of Biological Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
| | - Peter B Marko
- Department of Biology, University of Hawai'i, Mānoa, Hawaii
| | - John P Wares
- Department of Genetics, University of Georgia, Athens, Georgia
| | - Michael W Hart
- Department of Biological Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
|
12
|
Miga KH, Newton Y, Jain M, Altemose N, Willard HF, Kent WJ. Centromere reference models for human chromosomes X and Y satellite arrays. Genome Res 2014; 24:697-707. [PMID: 24501022 PMCID: PMC3975068 DOI: 10.1101/gr.159624.113] [Citation(s) in RCA: 156] [Impact Index Per Article: 15.6] [Indexed: 01/03/2023]
Abstract
The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes.
Affiliation(s)
- Karen H Miga
- Duke Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina 27708, USA
|
13
|
Sharma A, Wolfgruber TK, Presting GG. Tandem repeats derived from centromeric retrotransposons. BMC Genomics 2013; 14:142. [PMID: 23452340 PMCID: PMC3648361 DOI: 10.1186/1471-2164-14-142] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Received: 11/22/2012] [Accepted: 02/23/2013] [Indexed: 12/26/2022] Open
Abstract
Background: Tandem repeats are ubiquitous and abundant in higher eukaryotic genomes and constitute, along with transposable elements, much of DNA underlying centromeres and other heterochromatic domains. In maize, centromeric satellite repeat (CentC) and centromeric retrotransposons (CR), a class of Ty3/gypsy retrotransposons, are enriched at centromeres. Some satellite repeats have homology to retrotransposons and several mechanisms have been proposed to explain the expansion, contraction as well as homogenization of tandem repeats. However, the origin and evolution of tandem repeat loci remain largely unknown. Results: CRM1TR and CRM4TR are novel tandem repeats that we show to be entirely derived from CR elements belonging to two different subfamilies, CRM1 and CRM4. Although these tandem repeats clearly originated in at least two separate events, they are derived from similar regions of their respective parent element, namely the long terminal repeat (LTR) and untranslated region (UTR). The 5′ ends of the monomer repeat units of CRM1TR and CRM4TR map to different locations within their respective LTRs, while their 3′ ends map to the same relative position within a conserved region of their UTRs. Based on the insertion times of heterologous retrotransposons that have inserted into these tandem repeats, amplification of the repeats is estimated to have begun at least ~4 (CRM1TR) and ~1 (CRM4TR) million years ago. Distinct CRM1TR sequence variants occupy the two CRM1TR loci, indicating that there is little or no movement of repeats between loci, even though they are separated by only ~1.4 Mb. Conclusions: The discovery of two novel retrotransposon derived tandem repeats supports the conclusions from earlier studies that retrotransposons can give rise to tandem repeats in eukaryotic genomes. Analysis of monomers from two different CRM1TR loci shows that gene conversion is the major cause of sequence variation. We propose that successive intrastrand deletions generated the initial repeat structure, and gene conversions increased the size of each tandem repeat locus.
|
14
|
Abstract
Centromeres, the sites of spindle attachment during mitosis and meiosis, are located in specific positions in the human genome, normally coincident with diverse subsets of alpha satellite DNA. While there is strong evidence supporting the association of some subfamilies of alpha satellite with centromere function, the basis for establishing whether a given alpha satellite sequence is or is not designated a functional centromere is unknown, and attempts to understand the role of particular sequence features in establishing centromere identity have been limited by the near identity and repetitive nature of satellite sequences. Utilizing a broadly applicable experimental approach to test sequence competency for centromere specification, we have carried out a genomic and epigenetic functional analysis of endogenous human centromere sequences available in the current human genome assembly. The data support a model in which functionally competent sequences confer an opportunity for centromere specification, integrating genomic and epigenetic signals and promoting the concept of context-dependent centromere inheritance.
|
15
|
Abstract
Advances in human genomics have accelerated studies in evolution, disease, and cellular regulation. However, centromere sequences, defining the chromosomal interface with spindle microtubules, remain largely absent from ongoing genomic studies and disconnected from functional, genome-wide analyses. This disparity results from the challenge of predicting the linear order of multi-megabase-sized regions that are composed almost entirely of near-identical satellite DNA. Acknowledging these challenges, the field of human centromere genomics possesses the potential to rapidly advance given the availability of individual, or personalized, genome projects matched with the promise of long-read sequencing technologies. Here I review the current genomic model of human centromeres in consideration of those studies involving functional datasets that examine the role of sequence in centromere identity.
|
16
|
Lee HR, Hayden KE, Willard HF. Organization and molecular evolution of CENP-A-associated satellite DNA families in a basal primate genome. Genome Biol Evol 2011; 3:1136-49. [PMID: 21828373 PMCID: PMC3194837 DOI: 10.1093/gbe/evr083] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022] Open
Abstract
Centromeric regions in many complex eukaryotic species contain highly repetitive satellite DNAs. Despite the diversity of centromeric DNA sequences among species, the functional centromeres in all species studied to date are marked by CENP-A, a centromere-specific histone H3 variant. Although it is well established that families of multimeric higher-order alpha satellite are conserved at the centromeres of human and great ape chromosomes and that diverged monomeric alpha satellite is found in old and new world monkey genomes, little is known about the organization, function, and evolution of centromeric sequences in more distant primates, including lemurs. Aye-Aye (Daubentonia madagascariensis) is a basal primate and is located at a key position in the evolutionary tree to study centromeric satellite transitions in primate genomes. Using the approach of chromatin immunoprecipitation with antibodies directed to CENP-A, we have identified two satellite families, Daubentonia madagascariensis Aye-Aye 1 (DMA1) and Daubentonia madagascariensis Aye-Aye 2 (DMA2), related to each other but unrelated in sequence to alpha satellite or any other previously described primate or mammalian satellite DNA families. Here, we describe the initial genomic and phylogenetic organization of DMA1 and DMA2 and present evidence of higher-order repeats in Aye-Aye centromeric domains, providing an opportunity to study the emergence of chromosome-specific modes of satellite DNA evolution in primate genomes.
Affiliation(s)
- Hye-Ran Lee
- Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, USA
|
17
|
Paar V, Glunčić M, Basar I, Rosandić M, Paar P, Cvitković M. Large Tandem, Higher Order Repeats and Regularly Dispersed Repeat Units Contribute Substantially to Divergence Between Human and Chimpanzee Y Chromosomes. J Mol Evol 2010; 72:34-55. [DOI: 10.1007/s00239-010-9401-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/05/2010] [Accepted: 10/25/2010] [Indexed: 10/18/2022]
|
18
|
The rate of unequal crossing over in the dumpy gene from Drosophila melanogaster. J Mol Evol 2010; 70:260-5. [PMID: 20204610 DOI: 10.1007/s00239-010-9327-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 09/10/2009] [Accepted: 02/10/2010] [Indexed: 10/19/2022]
Abstract
The PIGSFEAST (PF) exon of the Drosophila dumpy gene is undergoing concerted evolution by the process of unequal crossing over. We have developed a long-range PCR-based assay to amplify the approximately 12 kb long exon which contains variable numbers of 303 or 306 nt long repeats in a tandem array. We applied this procedure to mutation accumulation lines of Drosophila melanogaster established by M. Wayne and L. Higgins. Nine new repeat length variants were found in these lines allowing us to measure the rate of unequal crossing over in the PF exon. The rate, which for several reasons is an underestimate, is 7.05 × 10⁻⁴ exchanges per generation.
|
19
|
|
20
|
Plohl M, Petrović V, Luchetti A, Ricci A, Satović E, Passamonti M, Mantovani B. Long-term conservation vs high sequence divergence: the case of an extraordinarily old satellite DNA in bivalve mollusks. Heredity (Edinb) 2009; 104:543-51. [PMID: 19844270 DOI: 10.1038/hdy.2009.141] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 12/27/2022] Open
Abstract
The ubiquity of satellite DNA (satDNA) sequences has raised much controversy over the abundance of divergent monomer variants and the long-time nucleotide sequence stability observed for many satDNA families. In this work, we describe the satDNA BIV160, characterized in nine species of the three main bivalve clades (Protobranchia, Pteriomorphia and Heteroconchia). BIV160 monomers are similar in repeat size and nucleotide sequence to satDNAs described earlier in oysters and in the clam Donax trunculus. The broad distribution of BIV160 satDNA indicates that similar variants existed in the ancestral bivalve species that lived about 540 million years ago; this makes BIV160 the most ancient satDNA described so far. In the species examined, monomer variants are distributed in quite a complex pattern. This pattern includes (i) species characterized by a specific group of variants, (ii) species that share distinct group(s) of variants and (iii) species with both specific and shared types. The evolutionary scenario suggested by these data reconciles sequence uniformity in homogenization-maintained satDNA arrays with the genomic richness of divergent monomer variants formed by diversification of the same ancestral satDNA sequence. Diversified repeats can continue to evolve in a non-concerted manner and behave as independent amplification-contraction units in the framework of a 'library of satDNA variants' representing a permanent source of monomers that can be amplified into novel homogeneous satDNA arrays. On the whole, diversification of satDNA monomers and copy number fluctuations provide a highly dynamic genomic environment able to form and displace satDNA sequence variants rapidly in evolution.
Affiliation(s)
- M Plohl
- Department of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia.
|
21
|
Guo WJ, Ling J, Li P. Consensus features of microsatellite distribution: Microsatellite contents are universally correlated with recombination rates and are preferentially depressed by centromeres in multicellular eukaryotic genomes. Genomics 2009; 93:323-31. [DOI: 10.1016/j.ygeno.2008.12.009] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 08/23/2008] [Revised: 12/14/2008] [Accepted: 12/16/2008] [Indexed: 10/21/2022]
|
22
|
Plohl M, Luchetti A, Mestrović N, Mantovani B. Satellite DNAs between selfishness and functionality: structure, genomics and evolution of tandem repeats in centromeric (hetero)chromatin. Gene 2007; 409:72-82. [PMID: 18182173 DOI: 10.1016/j.gene.2007.11.013] [Citation(s) in RCA: 230] [Impact Index Per Article: 13.5] [Received: 08/06/2007] [Revised: 11/08/2007] [Accepted: 11/20/2007] [Indexed: 12/21/2022]
Abstract
Satellite DNAs (tandemly repeated, non-coding DNA sequences) stretch over almost all native centromeres and surrounding pericentromeric heterochromatin. Once considered as inert by-products of genome dynamics in heterochromatic regions, recent studies showed that satellite DNA evolution is interplay of stochastic events and selective pressure. This points to a functional significance of satellite sequences, which in (peri)centromeres may play some fundamental functional roles. First, specific interactions with DNA-binding proteins are proposed to complement sequence-independent epigenetic processes. The second role is achieved through RNAi mechanism, in which transcripts of satellite sequences initialize heterochromatin formation. In addition, satellite DNAs in (peri)centromeric regions affect chromosomal dynamics and genome plasticity. Paradoxically, while centromeric function is conserved through eukaryotes, the profile of satellite DNAs in this region is almost always species-specific. We argue that tandem repeats may be advantageous forms of DNA sequences in (peri)centromeres due to concerted evolution, which maintains high intra-array and intrapopulation sequence homogeneity of satellite arrays, while allowing rapid changes in nucleotide sequence and/or composition of satellite repeats. This feature may be crucial for long-term stability of DNA-protein interactions in centromeric regions.
Affiliation(s)
- Miroslav Plohl
- Department of Molecular Genetics, Ruđer Bošković Institute, Bijenička 54, HR-10002 Zagreb, Croatia.
|
23
|
Luchetti A, Scanabissi F, Mantovani B. Evolution of LEP150 sub-repeat array within the ribosomal IGS of the clam shrimp Leptestheria dahalacensis (Crustacea Branchiopoda Conchostraca). Gene 2007; 400:174-80. [PMID: 17651923 DOI: 10.1016/j.gene.2007.06.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2007] [Revised: 06/14/2007] [Accepted: 06/19/2007] [Indexed: 11/28/2022]
Abstract
Leptestheria dahalacensis genome harbours repeats of the LEP150 satellite DNA family linked to 5S gene, within the ribosomal intergenic spacer. In genetically isolated samples, the sequence analysis of the region (5S, flanking region, first satellite monomer: unit A, second satellite monomer: unit B) evidenced three 5S variants. The alpha and gamma variants share a greater homology. They co-occur in the Central European samples, while in the Italian one, the highly divergent alpha and beta variants are present. In phylogenetic analyses, A and B LEP150 monomers show a peculiar clustering; this was further confirmed through the sequencing for the alpha variant of four monomers at the 5' and 3' tails (units A, B, C, D and D', C', B', A', respectively). Horizontal homogenisation was observed only across C, D, C' and D' units. Furthermore, repeat sequence diversity decrease toward terminal repeats, at variance of literature data. The pattern of variation observed is explained taking into account the presence at the LEP150 array borders of two loci under natural selection: the 5S rRNA gene, upstream, and the rDNA transcription promoter, downstream. These elements may drive the dynamics of flanking regions and linked repeats in a process similar to selective sweep. At variance of classical genetic hitchhiking, the selective sweep here scored should be realized and maintained through an interplay of selection and molecular drive.
Affiliation(s)
- Andrea Luchetti
- Dipartimento di Biologia E. S., Università degli Studi di Bologna, Bologna, Italy.
|
24
|
Mravinac B, Plohl M. Satellite DNA junctions identify the potential origin of new repetitive elements in the beetle Tribolium madens. Gene 2007; 394:45-52. [PMID: 17379457 DOI: 10.1016/j.gene.2007.01.019] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 11/28/2006] [Revised: 01/24/2007] [Accepted: 01/26/2007] [Indexed: 11/25/2022]
Abstract
Two related satellite DNA families (satellite I and satellite II) with complex higher-order repeat (HOR) monomers represent major DNA components equilocated in the pericentromeric heterochromatin of all Tribolium madens chromosomes. Fragments obtained upon genomic DNA restriction revealed two subfamilies of satellite II monomers, and also identified regions of transition between satellite I and satellite II sequences. The two subfamilies differ not only in diagnostic nucleotides, but also in flipped orientation of constituent subunits. Hybrid genomic fragments comprise directly linked satellite I and satellite II monomers that cannot be distinguished from randomly cloned monomers of corresponding families. An exception is the most proximal satellite I monomer in the hybrid fragment named TMADhinf, which shows sequence divergence typical for repeats evolving at array ends, in zones of low homogenization efficiency. This pattern points to the extensive rearrangement processes generating abrupt transitions between satellite arrays combined with array maintenance by unequal crossover. Switching points between adjacent satellites as well as the edges of flipped subunits are localized within a short sequence segment, indicating a preferential site of recombination within satellite subunits. Multiple copies of TMADhinf junction fragment support the hypothesis that sites of evolutionary origin of novel satellite repeat (sub)families can be localized at array ends, in regions of enhanced sequence divergence.
Affiliation(s)
- Brankica Mravinac
- Department of Molecular Biology, Ruđer Bošković Institute, Bijenička 54, HR-10002 Zagreb, Croatia
|
25
|
Carmon A, Wilkin M, Hassan J, Baron M, MacIntyre R. Concerted evolution within the Drosophila dumpy gene. Genetics 2007; 176:309-25. [PMID: 17237523 PMCID: PMC1893059 DOI: 10.1534/genetics.106.060897] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022] Open
Abstract
We have determined by reverse Southern analysis and direct sequence comparisons that most of the dumpy gene has evolved in the dipteran and other insect orders by purifying selection acting on amino acid replacements. One region, however, is evolving rapidly due to unequal crossing over and/or gene conversion. This region, called "PIGSFEAST," or PF, encodes in D. melanogaster 30-47 repeats of 102 amino acids rich in serines, threonines, and prolines. We show that the processes of concerted evolution have been operating on all species of Drosophila examined to date, but that an adjacent region has expanded in Anopheles gambiae, Aedes aegypti, and Tribolium castaneum, while the PF repeats are reduced in size and number. In addition, processes of concerted evolution have radically altered the codon usage patterns in D. melanogaster, D. pseudoobscura, and D. virilis compared with the rest of the dumpy gene. We show also that the dumpy gene is expressed on the inner surface of the micropyle of the mature oocyte and propose that, as in the abalone system, concerted evolution may be involved in adaptive changes affecting Dumpy's possible role in sperm-egg recognition.
Affiliation(s)
- Amber Carmon
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
|
26
|
Abstract
Centromeres are the elements of chromosomes that assemble the proteinaceous kinetochore, maintain sister chromatid cohesion, regulate chromosome attachment to the spindle, and direct chromosome movement during cell division. Although the functions of centromeres and the proteins that contribute to their complex structure and function are conserved in eukaryotes, centromeric DNA diverges rapidly. Human centromeres are particularly complicated. Here, we review studies on the organization of homogeneous arrays of chromosome-specific alpha-satellite repeats and evolutionary links among eukaryotic centromeric sequences. We also discuss epigenetic mechanisms of centromere identity that confer structural and functional features of the centromere through DNA-protein interactions and post-translational modifications, producing centromere-specific chromatin signatures. The assembly and organization of human centromeres, the contributions of satellite DNA to centromere identity and diversity, and the mechanism whereby centromeres are distinguished from the rest of the genome reflect ongoing puzzles in chromosome biology.
Affiliation(s)
- Mary G Schueler
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
|
27
|
Roizès G. Human centromeric alphoid domains are periodically homogenized so that they vary substantially between homologues. Mechanism and implications for centromere functioning. Nucleic Acids Res 2006; 34:1912-24. [PMID: 16598075 PMCID: PMC1447651 DOI: 10.1093/nar/gkl137] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Abstract
Sequence analysis of alphoid repeats from human chromosomes 17, 21 and 13 reveals recurrent diagnostic variant nucleotides. Their combinations define haplotypes, with higher order repeats (HORs) containing identical or closely-related haplotypes tandemly arranged into separate domains. The haplotypes found on homologues can be totally different, while HORs remain 99.8% homogeneous both intrachromosomally and between homologues. These results support the hypothesis, never before demonstrated, that unequal crossovers between sister chromatids accumulate to produce homogenization and amplification into tandem alphoid repeats. I propose that the molecular basis of this involves the diagnostic variant nucleotides, which enable pairing between HORs with identical or closely-related haplotypes. Domains are thus periodically renewed to maintain high intrachromosomal and interhomologue homogeneity. The capacity of a domain to form an active centromere is maintained as long as neither retrotransposons nor significant numbers of mutations affect it. In the presented model, a chromosome with an altered centromere can be transiently rescued by forming a neocentromere, until a restored, fully-competent domain is amplified de novo or rehomogenized through the accumulation of unequal crossovers.
Affiliation(s)
- Gérard Roizès
- Institut de Génétique Humaine, UPR 1142, CNRS, 141 Rue de la Cardonille, 34396 Montpellier Cedex 5, France.
|
28
|
Abstract
Alpha-satellite is a family of tandemly repeated sequences found at all normal human centromeres. In addition to its significance for understanding centromere function, alpha-satellite is also a model for concerted evolution, as alpha-satellite repeats are more similar within a species than between species. There are two types of alpha-satellite in the human genome; while both are made up of approximately 171-bp monomers, they can be distinguished by whether monomers are arranged in extremely homogeneous higher-order, multimeric repeat units or exist as more divergent monomeric alpha-satellite that lacks any multimeric periodicity. In this study, as a model to examine the genomic and evolutionary relationships between these two types, we have focused on the chromosome 17 centromeric region that has reached both higher-order and monomeric alpha-satellite in the human genome assembly. Monomeric and higher-order alpha-satellites on chromosome 17 are phylogenetically distinct, consistent with a model in which higher-order evolved independently of monomeric alpha-satellite. Comparative analysis between human chromosome 17 and the orthologous chimpanzee chromosome indicates that monomeric alpha-satellite is evolving at approximately the same rate as the adjacent non-alpha-satellite DNA. However, higher-order alpha-satellite is less conserved, suggesting different evolutionary rates for the two types of alpha-satellite.
Affiliation(s)
- M Katharine Rudd
- Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina 27708, USA
|
29
|
Schueler MG, Dunn JM, Bird CP, Ross MT, Viggiano L, Rocchi M, Willard HF, Green ED. Progressive proximal expansion of the primate X chromosome centromere. Proc Natl Acad Sci U S A 2005; 102:10563-8. [PMID: 16030148 PMCID: PMC1180780 DOI: 10.1073/pnas.0503346102] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.8] [Indexed: 11/18/2022] Open
Abstract
Previous studies of the pericentromeric region of the human X chromosome short arm (Xp) revealed an age gradient from ancient DNA that contains expressed genes to recent human-specific DNA at the functional centromere. We analyzed the finished sequence of this human genomic region to investigate its evolutionary history. Phylogenetic analysis of >1,500 alpha-satellite monomers from the region revealed the presence of five physical domains, each containing monomers from a distinct phylogenetic clade. The most distal domain contains long interspersed nucleotide element repeats that were active >35 million years ago, whereas the four proximal domains contain more recently active long interspersed nucleotide element repeats. An out-of-register, unequal recombination (i.e., crossover) detected at the edge of the X chromosome-specific alpha-satellite array (DXZ1) may reflect the most recent of a series of punctuating events during evolution that resulted in a proximal physical expansion of the X centromere. The first 18 kb of this array has 97-99% pairwise identity among all 2-kb repeat units. To perform more detailed evolutionary comparisons, we sequenced the junction between the ancient DNA of Xp and the primate-specific alpha satellite in chimpanzee, gorilla, orangutan, vervet, macaque, and baboon. The striking conservation found in all cases supports the ancestral nature of the alpha satellite at this location. These studies demonstrate that the primate X centromere appears to have evolved through repeated expansion events occurring within the central, active region of centromeric DNA, with the newly added sequences then conferring centromere function.
Affiliation(s)
- Mary G Schueler
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
30
|
Ekes C, Csonka E, Hadlaczky G, Cserpán I. Isolation, cloning and characterization of two major satellite DNA families of rabbit (Oryctolagus cuniculus). Gene 2005; 343:271-9. [PMID: 15588582 DOI: 10.1016/j.gene.2004.09.029] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 03/23/2004] [Revised: 07/13/2004] [Accepted: 09/23/2004] [Indexed: 11/19/2022]
Abstract
We report here the isolation, cloning and characterization of two abundant centromeric satellite sequences (Rsat I and Rsat II) that are not related to each other, and that of a divergent subfamily (Rsat IIE) of rabbit (Oryctolagus cuniculus). The Rsat I monomers had a 375 base pair (bp) average length, while repeat units Rsat II and Rsat IIE were approximately 585 bp long. Variable amounts of Rsat I were detected by FISH at the centromeric region of 11 chromosome pairs of the complement. Rsat II hybridized to the centromere of 12 different chromosomes, and two of these were labeled also with the Rsat IIE probe. Two-color in situ hybridizations with the satellite probes and rDNA revealed that the NOR chromosomes carried different satellites. Rsat I was abundant on chromosomes 20 and 21, but it was undetectable on chromosomes 13 and 16. Large Rsat II arrays were found on chromosomes 16, 20 and 21, but a reduced amount was detected on chromosome 13. The variant Rsat IIE was prominent on chromosome 16, but was absent from the other rDNA-bearing chromosomes. The rDNA signal on chromosome 21 was localized to the 21q(ter) region, which can be a useful cytological marker in comparative cytological studies. These data show that rabbit chromosomes form at least four distinct groups based on the satellite composition of their centromeres. The differences in the chromosomal distribution of satellite families will facilitate FISH identification of individual chromosomes and help to unveil the evolutionary history of the Leporidae karyotype.
Affiliation(s)
- Csaba Ekes
- Institute of Genetics, Biological Research Center of the Hungarian Academy of Sciences, H-6701 Szeged, Temesvári krt. 62., P.O. Box 521, Hungary
|
31
|
Luchetti A, Marino A, Scanabissi F, Mantovani B. Genomic dynamics of a low-copy-number satellite DNA family in Leptestheria dahalacensis (Crustacea, Branchiopoda, Conchostraca). Gene 2004; 342:313-20. [PMID: 15527990 DOI: 10.1016/j.gene.2004.08.018] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 06/21/2004] [Revised: 08/06/2004] [Accepted: 08/19/2004] [Indexed: 10/26/2022]
Abstract
The LEP150 satellite DNA (satDNA) family found in Leptestheria dahalacensis (Ruppel, 1837) (Conchostraca) is a low-copy-number satellite with a canonical monomer of 150 bp. Nucleotide variation analyses suggest a 14-bp palindromic region as a possible protein binding site, with constraints acting on the whole sequence except for a 25-bp variable box. Besides the head-to-tail arrangement of 150-bp monomers, multimer analyses revealed incomplete monomers, one duplication event, and three inversions. Both the observed rearrangements and the higher sequence variability scored suggest that rearranged monomers reside in regions with a lower degree of homogenisation efficiency. Sixty-seven percent of the breakpoints occur at kinkable dinucleotides, thus supporting their role in rearrangements as documented in alphoid satDNA recombination events. Monomers of different lengths may result from crossing over between repeats misaligned through the direct and inverted subrepeats of LEP150 monomers. ANOVA results indicate that the same range of sequence diversity is experienced at the individual and population ranks; therefore, the evolution of the L. dahalacensis satDNA is concerted.
Affiliation(s)
- Andrea Luchetti
- Dipartimento di Biologia Evoluzionistica Sperimentale, Università di Bologna, Via Selmi 3, Bologna 40126, Italy
|
32
|
Rudd MK, Schueler MG, Willard HF. Sequence organization and functional annotation of human centromeres. Cold Spring Harb Symp Quant Biol 2004; 68:141-9. [PMID: 15338612 DOI: 10.1101/sqb.2003.68.141] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.9] [Indexed: 11/25/2022]
Affiliation(s)
- M K Rudd
- Institute for Genome Sciences & Policy, Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina 27710, USA
|
33
|
Antson DO, Mendel-Hartvig M, Landegren U, Nilsson M. PCR-generated padlock probes distinguish homologous chromosomes through quantitative fluorescence analysis. Eur J Hum Genet 2003; 11:357-63. [PMID: 12734539 DOI: 10.1038/sj.ejhg.5200966] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022] Open
Abstract
Conventional cytogenetic techniques can distinguish homologous chromosomes in a qualitative manner based upon obvious morphological features or using in situ hybridization methods that yield qualitative data. We have developed a method for quantitative genotyping of single-nucleotide variants in situ using circularizable DNA probes, so-called padlock probes, targeting two different alpha satellite repeat variants present in human chromosome 7 centromeres, and a single-nucleotide variation in alpha satellite repeats on human chromosome 15 centromeres. By using these PCR-generated padlock probes, we could quantitatively distinguish homologous chromosomes and follow the transmission of the chromosomes by in situ analysis during three consecutive generations.
Affiliation(s)
- Dan-Oscar Antson
- The Beijer Laboratory, Department of Genetics and Pathology, Rudbeck Laboratory, SE-751 85 Uppsala, Sweden
|
34
|
Rosandić M, Paar V, Basar I. Key-string segmentation algorithm and higher-order repeat 16mer (54 copies) in human alpha satellite DNA in chromosome 7. J Theor Biol 2003; 221:29-37. [PMID: 12634041 DOI: 10.1006/jtbi.2003.3165] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
A new key-string segmentation algorithm for identification of alpha satellite DNAs and higher-order repeat (HOR) units was introduced and exemplified. Starting with an initial key string, we determine the dominant key string and HOR. Our key-string algorithm was used to scan the recent GenBank data for human alpha satellite DNA sequence AC017075.8 (193,277 bp) from the centromeric region of chromosome 7. The sequence was computationally segmented into one HOR domain (super-repeat domain) and two non-HOR domains. The dominant key string GTTTCT provided segmentation in terms of alpha monomers. The HOR is tandemly repeated in 54 copies in the super-repeat (HOR) domain. Five insertions and three deletions in the HOR structure associated with a dominant key string were identified. A consensus HOR was constructed. Divergence of individual HOR copies from the consensus amounts to 0.7% on average, while divergence between the 16 monomer variants within each HOR is on average 20%. In the front and back domains, 199 monomer variants were identified that are not organized in HOR and diverge by 20-40%.
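As an illustration only, the segmentation step described in this abstract can be sketched in Python. This is a minimal sketch, not the published algorithm: the function name `segment_by_key_string` and the toy sequence are hypothetical, and the determination of the dominant key string and of the HOR structure is not attempted here. It only shows how cutting a sequence at each occurrence of a key string such as GTTTCT yields fragments whose lengths can then be inspected for a monomer-sized periodicity.

```python
def segment_by_key_string(sequence: str, key: str) -> list[str]:
    """Cut `sequence` so that every fragment (except possibly the first)
    starts at a non-overlapping occurrence of `key`."""
    if not key:
        raise ValueError("key string must be non-empty")

    # Collect the start position of each non-overlapping occurrence.
    starts = []
    pos = sequence.find(key)
    while pos != -1:
        starts.append(pos)
        pos = sequence.find(key, pos + len(key))

    if not starts:
        return [sequence]  # key never occurs: nothing to segment

    fragments = []
    if starts[0] > 0:
        fragments.append(sequence[:starts[0]])  # leading piece before first key
    ends = starts[1:] + [len(sequence)]
    for begin, end in zip(starts, ends):
        fragments.append(sequence[begin:end])
    return fragments


# Toy example (hypothetical sequence, far shorter than a real monomer).
toy = "AAGTTTCTCCCGTTTCTGG"
fragments = segment_by_key_string(toy, "GTTTCT")
lengths = [len(f) for f in fragments]
print(fragments)  # ['AA', 'GTTTCTCCC', 'GTTTCTGG']
print(lengths)    # [2, 9, 8]
```

On real alpha satellite sequence, one would expect many fragment lengths near the monomer length if the key string occurs about once per monomer; that expectation is an assumption of this sketch rather than a result taken from the paper.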
Affiliation(s)
- M Rosandić
- Department of Internal Medicine, University Hospital Rebro, University of Zagreb, Kispatićeva 12, Zagreb, Croatia
|
35
|
Abstract
Centromeres are the site for kinetochore formation and spindle attachment and are embedded in heterochromatin in most eukaryotes. The repeat-rich nature of heterochromatin has hindered obtaining a detailed understanding of the composition and organization of heterochromatic and centromeric DNA sequences. Here, we report the results of extensive sequence analysis of a fully functional centromere present in the Drosophila Dp1187 minichromosome. Approximately 8.4% (31 kb) of the highly repeated satellite DNA (AATAT and TTCTC) was sequenced, representing the largest data set of Drosophila satellite DNA sequence to date. Sequence analysis revealed that the orientation of the arrays is uniform and that individual repeats within the arrays mostly differ by rare, single-base polymorphisms. The entire complex DNA component of this centromere (69.7 kb) was sequenced and assembled. The 39-kb "complex island" Maupiti contains long stretches of a complex A+T rich repeat interspersed with transposon fragments, and most of these elements are organized as direct repeats. Surprisingly, five single, intact transposons are directly inserted at different locations in the AATAT satellite arrays. We find no evidence for centromere-specific sequences within this centromere, providing further evidence for sequence-independent, epigenetic determination of centromere identity and function in higher eukaryotes. Our results also demonstrate that the sequence composition and organization of large regions of centric heterochromatin can be determined, despite the presence of repeated DNA.
Affiliation(s)
- Xiaoping Sun
- Molecular and Cell Biology Laboratory, The Salk Institute, La Jolla, CA 92037, USA
|
36
|
Kouprina N, Ebersole T, Koriabine M, Pak E, Rogozin IB, Katoh M, Oshimura M, Ogi K, Peredelchuk M, Solomon G, Brown W, Barrett JC, Larionov V. Cloning of human centromeres by transformation-associated recombination in yeast and generation of functional human artificial chromosomes. Nucleic Acids Res 2003; 31:922-34. [PMID: 12560488 PMCID: PMC149202 DOI: 10.1093/nar/gkg182] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.0] [Received: 10/15/2002] [Revised: 12/05/2002] [Accepted: 12/05/2002] [Indexed: 11/12/2022] Open
Abstract
Human centromeres remain poorly characterized regions of the human genome despite their importance for the maintenance of chromosomes. In part this is due to the difficulty of cloning of highly repetitive DNA fragments and distinguishing chromosome-specific clones in a genomic library. In this work we report the highly selective isolation of human centromeric DNA using transformation-associated recombination (TAR) cloning. A TAR vector with alphoid DNA monomers as targeting sequences was used to isolate large centromeric regions of human chromosomes 2, 5, 8, 11, 15, 19, 21 and 22 from human cells as well as monochromosomal hybrid cells. The alphoid DNA array was also isolated from the 12 Mb human mini-chromosome DeltaYq74 that contained the minimum amount of alphoid DNA required for proper chromosome segregation. Preliminary results of the structural analyses of different centromeres are reported in this paper. The ability of the cloned human centromeric regions to support human artificial chromosome (HAC) formation was assessed by transfection into human HT1080 cells. Centromeric clones from DeltaYq74 did not support the formation of HACs, indicating that the requirements for the existence of a functional centromere on an endogenous chromosome and those for forming a de novo centromere may be distinct. A construct with an alphoid DNA array from chromosome 22 with no detectable CENP-B motifs formed mitotically stable HACs in the absence of drug selection without detectable acquisition of host DNAs. In summary, our results demonstrated that TAR cloning is a useful tool for investigating human centromere organization and the structural requirements for formation of HAC vectors that might have a potential for therapeutic applications.
Affiliation(s)
- N Kouprina
- Laboratory of Biosystems and Cancer, Center for Cancer Research, National Cancer Institute, NIH, Building 37, Room 5032, Bethesda, MD 20892, USA.
|
37
|
Schindelhauer D, Schwarz T. Evidence for a fast, intrachromosomal conversion mechanism from mapping of nucleotide variants within a homogeneous alpha-satellite DNA array. Genome Res 2002; 12:1815-26. [PMID: 12466285 PMCID: PMC187568 DOI: 10.1101/gr.451502] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.4] [Indexed: 11/25/2022]
Abstract
Assuming that patterns of sequence variants within highly homogeneous centromeric tandem repeat arrays can tell us which molecular turnover mechanisms are presently at work, we analyzed the alpha-satellite tandem repeat array DXZ1 of one human X chromosome. Here we present accurate snapshots from this dark matter of the genome. We demonstrate stable and representative cloning of the array in a P1 artificial chromosome (PAC) library, use samples of higher-order repeats subcloned from five unmapped PACs (120-160 kb) to identify common variants, and show that such variants are presently in a fixed transition state. To characterize patterns of variant spread throughout homogeneous array segments, we use a novel partial restriction and pulsed-field gel electrophoresis mapping approach. We find an older large-scale (35-50 kb) duplication event supporting the evolutionarily important unequal crossing-over hypothesis, but generally find independent variant occurrence and a paucity of potential de novo mutations within segments of highest homogeneity (99.1%-99.3%). Within such segments, a highly nonrandom variant clustering within adjacent higher-order repeats was found in the absence of haplotypic repeats. Such variant clusters are hardly explained by interchromosomal, fixation-driving mechanisms and likely reflect a fast, localized, intrachromosomal sequence conversion mechanism.
Affiliation(s)
- Dirk Schindelhauer
- Institute of Human Genetics, Technical University of Munich, Munich, Germany.
|
38
|
Schueler MG, Higgins AW, Rudd MK, Gustashaw K, Willard HF. Genomic and genetic definition of a functional human centromere. Science 2001; 294:109-15. [PMID: 11588252 DOI: 10.1126/science.1065042] [Citation(s) in RCA: 395] [Impact Index Per Article: 17.2] [Indexed: 11/02/2022]
Abstract
The definition of centromeres of human chromosomes requires a complete genomic understanding of these regions. Toward this end, we report integration of physical mapping, genetic, and functional approaches, together with sequencing of selected regions, to define the centromere of the human X chromosome and to explore the evolution of sequences responsible for chromosome segregation. The transitional region between expressed sequences on the short arm of the X and the chromosome-specific alpha satellite array DXZ1 spans about 450 kilobases and is satellite-rich. At the junction between this satellite region and canonical DXZ1 repeats, diverged repeat units provide direct evidence of unequal crossover as the homogenizing force of these arrays. Results from deletion analysis of mitotically stable chromosome rearrangements and from a human artificial chromosome assay demonstrate that DXZ1 DNA is sufficient for centromere function. Evolutionary studies indicate that, while alpha satellite DNA present throughout the pericentromeric region of the X chromosome appears to be a descendant of an ancestral primate centromere, the current functional centromere based on DXZ1 sequences is the product of the much more recent concerted evolution of this satellite DNA.
MESH Headings
- Animals
- Base Sequence
- Cell Line
- Centromere/chemistry
- Centromere/genetics
- Centromere/physiology
- Chromosome Segregation
- Chromosomes, Artificial, Bacterial
- Chromosomes, Artificial, Human
- Computer Simulation
- Contig Mapping
- Crossing Over, Genetic
- DNA, Satellite/chemistry
- DNA, Satellite/genetics
- DNA, Satellite/physiology
- Evolution, Molecular
- Humans
- Interspersed Repetitive Sequences
- Models, Genetic
- Phylogeny
- Repetitive Sequences, Nucleic Acid
- Restriction Mapping
- Sequence Analysis, DNA
- Sequence Deletion
- Sequence Tagged Sites
- Transfection
- Turner Syndrome/genetics
- X Chromosome/genetics
- X Chromosome/physiology
- X Chromosome/ultrastructure
Affiliation(s)
- M G Schueler
- Department of Genetics, Case Western Reserve University School of Medicine and Center for Human Genetics, and, Research Institute, University Hospitals of Cleveland, Cleveland, OH 44106, USA
|
39
|
Bassi C, Magnani I, Sacchi N, Saccone S, Ventura A, Rocchi M, Marozzi A, Ginelli E, Meneveri R. Molecular structure and evolution of DNA sequences located at the alpha satellite boundary of chromosome 20. Gene 2000; 256:43-50. [PMID: 11054534 DOI: 10.1016/s0378-1119(00)00354-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
We have isolated and characterised one PAC clone (dJ233C1) containing a linkage between alphoid and non-alphoid DNA. The non-alphoid DNA was found to map at the pericentromeric region of chromosome 20, both on p and q sides, and to contain homologies with one contig (ctg176, Sanger Centre), also located in the same chromosome region. At variance with the chromosome specificity shown by the majority of non-alphoid DNA, a subset of alphoid repeats derived from the PAC yielded FISH hybridisation signals located at the centromeric region of several human chromosomes, belonging to three different suprachromosomal families. The evolutionary conservation of this boundary region was investigated by comparative FISH experiments on chromosomes from great apes. The non-alphoid DNA was found to have undergone events of expansion and transposition to different pericentromeric regions of great ape chromosomes. Alphoid sequences revealed a very wide distribution of FISH signals in the great apes. The pattern was substantially discordant with the data available in the literature, which is essentially derived from the central alphoid subset. These results add further support to the emerging opinion that the pericentromeric regions are highly plastic, and that the alpha satellite junctions do not share the evolutionary history of the main subsets.
Affiliation(s)
- C Bassi
- Dipartimento di Biologia e Genetica per le Scienze Mediche, Università di Milano, 20133, Milan, Italy
|
40
|
O'Keefe CL, Matera AG. Alpha satellite DNA variant-specific oligoprobes differing by a single base can distinguish chromosome 15 homologs. Genome Res 2000; 10:1342-50. [PMID: 10984452 DOI: 10.1101/gr.10.9.1342] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
The ability to distinguish homologous chromosomes is a powerful cytogenetic tool. However, traditional techniques can only distinguish extreme physical variants and are highly dependent on sample preparation. We have previously reported oligonucleotide probes, specific for human chromosome 17 alpha satellite DNA sequence variants, that distinguish cytogenetically normal homologous chromosomes by FISH. Here we report the development of similar oligoprobes, differing at a single nucleotide position, that not only distinguish homologous chromosomes 15 but can be used to follow the transmission of a chromosome from parents to their offspring. We also identified a novel array-size polymorphism in another family. The alphoid array of one chromosome is quite small and below the detection threshold for our oligoprobes, although it is detectable by conventional FISH probes. This size polymorphism provides an additional FISH-based method for distinguishing homologs. Most importantly, this work illustrates the potential applicability of the technique to the entire human chromosome complement.
Affiliation(s)
- C L O'Keefe
- Department of Genetics, Case Western Reserve University and University Hospitals of Cleveland, Cleveland, Ohio 44106-4955 USA
|
41
|
Warburton PE, Willard HF. Interhomologue sequence variation of alpha satellite DNA from human chromosome 17: evidence for concerted evolution along haplotypic lineages. J Mol Evol 1995; 41:1006-15. [PMID: 8587099 DOI: 10.1007/bf00173182] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.2] [Indexed: 01/31/2023]
Abstract
Alpha satellite DNA is a family of tandemly repeated DNA found at the centromeres of all primate chromosomes. Different human chromosomes 17 in the population are characterized by distinct alpha satellite haplotypes, distinguished by the presence of variant repeat forms that have precise monomeric deletions. Pair-wise comparisons of sequence diversity between variant repeat units from each haplotype show that they are closely related in sequence. Direct sequencing of PCR-amplified alpha satellite reveals heterogeneous positions between the repeat units on a chromosome as two bands at the same position on a sequencing ladder. No variation was detected in the sequence and location of these heterogeneous positions between chromosomes 17 from the same haplotype, but distinct patterns of variation were detected between chromosomes from different haplotypes. Subsequent sequence analysis of individual repeats from each haplotype confirmed the presence of extensive haplotype-specific sequence variation. Phylogenetic inference yielded a tree that suggests these chromosome 17 repeat units evolve principally along haplotypic lineages. These studies allow insight into the relative rates and/or timing of genetic turnover processes that lead to the homogenization of tandem DNA families.
Affiliation(s)
- P E Warburton
- Department of Genetics, Stanford University, CA 94305, USA
|
42
|
Abstract
The discoveries, advancements and continuing controversies in the field of molecular evolution are reviewed. Topics summarized are (1) the evolution of the genetic code, (2) gene evolution including the demonstration of homology, estimation of sequence divergence, phylogenetic trees, the molecular clock and the origin of genes and gene families by various genetic mechanisms, and (3) eukaryotic genome evolution, including the highly repeated satellite sequences, the interspersed and potentially mobile repeated sequences and the unique sequence fraction of the genome.
Affiliation(s)
- R J MacIntyre
- Section of Genetics and Development, Cornell University, Ithaca, NY 14853-2703
|
43
|
Lee C, Ritchie DB, Lin CC. A tandemly repetitive, centromeric DNA sequence from the Canadian woodland caribou (Rangifer tarandus caribou): its conservation and evolution in several deer species. Chromosome Res 1994; 2:293-306. [PMID: 7921645 DOI: 10.1007/bf01552723] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 01/27/2023]
Abstract
A highly repetitive DNA clone, designated Rt-Pst3, was isolated from the PstI digest of Canadian woodland caribou (Rangifer tarandus caribou; 2n = 70) genomic DNA. It was found to be a 991 bp monomer of a tandemly repeated DNA sequence comprising about 5.7% of the genome and localized to the centromeric regions of all caribou acrocentric autosomes. Southern blot analyses revealed that this caribou satellite DNA sequence was well conserved in the genomes of five other deer species studied. In situ hybridization studies revealed Rt-Pst3-homologous DNA sequences in the centromeric regions of white-tailed deer chromosomes and Asian muntjac chromosomes, as well as at several interstitial chromosome regions in Indian muntjac chromosomes. Comparisons of the Rt-Pst3 DNA sequence to previously identified centromeric satellite DNA fragments from three other deer species revealed considerable DNA sequence similarity. The first ca. 800 bp of the Rt-Pst3 clone was found to share 73.8% similarity to the CCsatI clone of the European roe deer, 64.7% sequence similarity to the C5 DNA clone of the Chinese muntjac, and 64.8% and 65.6% sequence similarity to the 1A and B1 clones of the Indian muntjac, respectively. Moreover, the last 191 bp of the Rt-Pst3 clone was found to share about 60% DNA sequence similarity to the first 191 bp of the same clone. Amplification of one original ca. 800 bp monomer unit, along with the first 191 bp of the following juxtaposed monomer unit could have resulted in the tandemly repeated, 991 bp monomer unit now seen in the caribou genome. It is postulated that the centromeric satellite DNA found in other deer species, having repeat lengths greater than 800 bp, could also have evolved in a similar manner from a more ancestral monomeric unit of ca. 800 bp.
Affiliation(s)
- C Lee
- Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, Canada
|
45
|
Warburton PE, Waye JS, Willard HF. Nonrandom localization of recombination events in human alpha satellite repeat unit variants: implications for higher-order structural characteristics within centromeric heterochromatin. Mol Cell Biol 1993; 13:6520-9. [PMID: 8413251 PMCID: PMC364711 DOI: 10.1128/mcb.13.10.6520-6529.1993] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.3] [Indexed: 01/30/2023]
Abstract
Tandemly repeated DNA families appear to undergo concerted evolution, such that repeat units within a species have a higher degree of sequence similarity than repeat units from even closely related species. While intraspecies homogenization of repeat units can be explained satisfactorily by repeated rounds of genetic exchange processes such as unequal crossing over and/or gene conversion, the parameters controlling these processes remain largely unknown. Alpha satellite DNA is a noncoding tandemly repeated DNA family found at the centromeres of all human and primate chromosomes. We have used sequence analysis to investigate the molecular basis of 13 variant alpha satellite repeat units, allowing comparison of multiple independent recombination events in closely related DNA sequences. The distribution of these events within the 171-bp monomer is nonrandom and clusters in a distinct 20- to 25-bp region, suggesting possible effects of primary sequence and/or chromatin structure. The position of these recombination events may be associated with the location within the higher-order repeat unit of the binding site for the centromere-specific protein CENP-B. These studies have implications for the molecular nature of genetic recombination, mechanisms of concerted evolution, and higher-order structure of centromeric heterochromatin.
Affiliation(s)
- P E Warburton
- Department of Genetics, Stanford University, California 94305
|
46
|
Wevrick R, Willard VP, Willard HF. Structure of DNA near long tandem arrays of alpha satellite DNA at the centromere of human chromosome 7. Genomics 1992; 14:912-23. [PMID: 1478672 DOI: 10.1016/s0888-7543(05)80112-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.3] [Indexed: 12/27/2022]
Abstract
The centromeric regions of human chromosomes contain long tracts of tandemly repeated DNA, of which the most extensively characterized is alpha satellite. In a screen for additional centromeric DNA sequences, four phage clones were obtained which contain alpha satellite as well as other sequences not usually found associated with tandemly repeated alpha satellite DNA, including L1 repetitive elements, an Alu element, and a novel AT-rich repeated sequence. The alpha satellite DNA contained within these clones does not demonstrate the higher-order repeat structure typical of tandemly repeated alpha satellite. Two of the clones contain inversions; instead of the usual head-to-tail arrangement of alpha satellite monomers, the direction of the monomers changes partway through each clone. The presence of both inversions was confirmed in human genomic DNA by polymerase chain reaction amplification of the inverted regions. One phage clone contains a junction between alpha satellite DNA and a novel low-copy repeated sequence. The junction between the two types of DNA is abrupt and the junction sequence is characterized by the presence of runs of A's and T's, yielding an overall base composition of 65% AT with local areas > 80% AT. The AT-rich sequence is found in multiple copies on chromosome 7 and homologous sequences are found in (peri)centromeric locations on other human chromosomes, including chromosomes 1, 2, and 16. As such, the AT-rich sequence adjacent to alpha satellite DNA provides a tool for the further study of the DNA from this region of the chromosome. The phage clones examined are located within the same 3.3-Mb SstII restriction fragment on chromosome 7 as the two previously described alpha satellite arrays, D7Z1 and D7Z2. These new clones demonstrate that centromeric repetitive DNA, at least on chromosome 7, may be more heterogeneous in composition and organization than had previously been thought.
Affiliation(s)
- R Wevrick
- Department of Genetics, Stanford University, California 94305
|
47
|
Warburton PE, Willard HF. PCR amplification of tandemly repeated DNA: analysis of intra- and interchromosomal sequence variation and homologous unequal crossing-over in human alpha satellite DNA. Nucleic Acids Res 1992; 20:6033-42. [PMID: 1461735 PMCID: PMC334470 DOI: 10.1093/nar/20.22.6033] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022]
Abstract
Tandemly repeated DNA can comprise several percent of total genomic DNA in complex organisms and, in some instances, may play a role in chromosome structure or function. Alpha satellite DNA is the major family of tandemly repeated DNA found at the centromeres of all human and primate chromosomes. Each centromere is characterized by a large contiguous array of up to several thousand kb which can contain several thousand highly homogeneous repeat units. By using a novel application of the polymerase chain reaction (repPCR), we are able to amplify a representative sampling of multiple repetitive units simultaneously, allowing rapid analysis of chromosomal subsets. Direct sequence analysis of repPCR amplified alpha satellite from chromosomes 17 and X reveals positions of sequence heterogeneity as two bands at a single nucleotide position on a sequencing ladder. The use of TdT in the sequencing reactions greatly reduces the background associated with polymerase pauses and stops, allowing visualization of heterogeneous bases found in as little as 10% of the repeat units. Confirmation of these heterogeneous positions was obtained by comparison to the sequence of multiple individual cloned copies obtained both by PCR and non-PCR based methods. PCR amplification of alpha satellite can also reveal multiple repeat units which differ in size. Analysis of repPCR products from chromosome 17 and X allows rapid determination of the molecular basis of these repeat unit length variants, which appear to be a result of unequal crossing-over. The application of repPCR to the study of tandemly repeated DNA should allow in-depth analysis of intra- and interchromosomal variation and unequal crossing-over, thus providing insight into the biology and genetics of these large families of DNA.
Affiliation(s)
- P E Warburton
- Department of Genetics, Stanford University, CA 94305
|
48
|
Fan YS, Sasi R, Lee C, Court D, Lin CC. Mapping of 50 cosmid clones isolated from a flow-sorted human X chromosome library by fluorescence in situ hybridization. Genomics 1992; 14:542-5. [PMID: 1427877 DOI: 10.1016/s0888-7543(05)80264-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 12/27/2022]
Abstract
Fifty cosmids have been mapped to metaphase chromosomes by fluorescence in situ hybridization under conditions that suppress signals from repetitive DNA sequences. The cosmid clones were isolated from a flow-sorted human X chromosome library. Thirty-eight of the clones were localized to chromosome X and 12 to autosomes such as chromosomes 3, 7, 8, 14, and 17. Although most of the cosmids mapped to the X chromosome appeared to be scattered along both the short and long arms, 10 cosmids were localized to the centromeric region of the chromosome. Southern blot analysis revealed that only two of these clones hybridized to probe pXBR-1, which detects the DXZ1 locus. In addition, 4 out of 5 cosmids mapped on chromosome 8 also localized on the centromeric region. While localization of X-specific cosmids will facilitate the physical mapping of the human X chromosome, cosmids mapped to the centromeric regions of chromosomes X and 8 should be especially useful for studying the structure and organization of these regions.
Affiliation(s)
- Y S Fan
- Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, Canada
|
49
|
Looijenga LH, Oosterhuis JW, Smit VT, Wessels JW, Mollevanger P, Devilee P. Alpha satellite DNAs on chromosomes 10 and 12 are both members of the dimeric suprachromosomal subfamily, but display little identity at the nucleotide sequence level. Genomics 1992; 13:1125-32. [PMID: 1505948 DOI: 10.1016/0888-7543(92)90027-p] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022]
Abstract
We have investigated the organization and complexity of alpha satellite DNA on chromosomes 10 and 12 by restriction endonuclease mapping, in situ hybridization (ISH), and DNA-sequencing methods. Alpha satellite DNA on both chromosomes displays a basic dimeric organization, revealed as a 6- and an 8-mer higher-order repeat (HOR) unit on chromosome 10 and as an 8-mer HOR on chromosome 12. While these HORs show complete chromosome specificity under high-stringency ISH conditions, they recognize an identical set of chromosomes under lower stringencies. At the nucleotide sequence level, both chromosome 10 HORs are 50% identical to the HOR on chromosome 12 and to all other alpha satellite DNA sequences from the in situ cross-hybridizing chromosomes, with the exception of chromosome 6. An 80% identity between chromosome 6- and chromosome 10-derived alphoid sequences was observed. These data suggest that the alphoid DNA on chromosomes 6 and 10 may represent a distinct subclass of the dimeric subfamily. These sequences are proposed to be present, along with the more typical dimeric alpha satellite sequences, on a number of different human chromosomes.
Affiliation(s)
- L H Looijenga
- Laboratory of Experimental Patho-Oncology, Dr. Daniel den Hoed Cancer Center, Rotterdam, The Netherlands
|
50
|
Haaf T, Willard HF. Organization, polymorphism, and molecular cytogenetics of chromosome-specific alpha-satellite DNA from the centromere of chromosome 2. Genomics 1992; 13:122-8. [PMID: 1577477 DOI: 10.1016/0888-7543(92)90211-a] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.8] [Indexed: 12/27/2022]
Abstract
The general usefulness of alpha-satellite DNA probes for the molecular, genetic, and cytogenetic analysis of the human genome is enhanced by their being chromosome specific. Here, we describe the isolation and characterization of an alpha-satellite subset specific for human chromosome 2. Three clones, p2-7, p2-8, and p2-11, obtained from an EcoRI-digested lambda phage library from flow-sorted chromosome 2, are specific for the centromere of chromosome 2 by somatic cell hybrid mapping and chromosomal in situ hybridization. Nucleotide sequence analysis identifies the chromosome 2-specific alpha-satellite subset D2Z1 as a member of the suprachromosomal subfamily II, which is based on a characteristic two-monomer repeat. The D2Z1 subset is further organized as a series of diverged 680-bp tetramers, revealed after digestion of genomic DNA with HaeIII, HindIII, HinfI, StuI, and XbaI. Using pulsed-field gel electrophoresis (PFGE), probes p2-7, p2-8, and p2-11 detect polymorphic restriction patterns within the alpha-satellite array. Among 15 different chromosomes 2 (in two two-generation families and one three-generation family), the length of the D2Z1 alpha-satellite array varied between 1050 and 2900 kb (mean = 1850 kb, SD = 550 kb). The inheritance of the chromosome 2 alpha-satellite arrays and their associated polymorphisms was strictly Mendelian.
Affiliation(s)
- T Haaf
- Department of Genetics, Stanford University School of Medicine, California 94305
|