1
Chen Y, Zhang T, Xian M, Zhang R, Yang W, Su B, Yang G, Sun L, Xu W, Xu S, Gao H, Xu L, Gao X, Li J. A draft genome of Drung cattle reveals clues to its chromosomal fusion and environmental adaptation. Commun Biol 2022; 5:353. [PMID: 35418663 PMCID: PMC9008013 DOI: 10.1038/s42003-022-03298-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/06/2021] [Accepted: 03/21/2022] [Indexed: 12/02/2022] Open
Abstract
Drung cattle (Bos frontalis) have 58 chromosomes, differing from the Bos taurus 2n = 60 karyotype. To date, its origin and evolution history have not been proven conclusively, and the mechanisms of chromosome fusion and environmental adaptation have not been clearly elucidated. Here, we assembled a high integrity and good contiguity genome of Drung cattle with 13.7-fold contig N50 and 4.1-fold scaffold N50 improvements over the recently published Indian mithun assembly, respectively. Speciation time estimation and phylogenetic analysis showed that Drung cattle diverged from Bos taurus into an independent evolutionary clade. Sequence evidence of centromere regions provides clues to the breakpoints in BTA2 and BTA28 centromere satellites. We furthermore integrated a circulation and contraction-related biological process involving 43 evolutionary genes that participated in pathways associated with the evolution of the cardiovascular system. These findings may have important implications for understanding the molecular mechanisms of chromosome fusion, alpine valleys adaptability and cardiovascular function.
Affiliation(s)
- Yan Chen
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
- Tianliu Zhang
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
- Ming Xian
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
- Rui Zhang
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
- Weifei Yang
  - 1 Gene Co., Ltd, 310051, Hangzhou, P.R. China
  - Annoroad Gene Technology (Beijing) Co., Ltd, 100176, Beijing, P.R. China
- Baqi Su
  - Drung Cattle Conservation Farm in Jiudang Wood, Drung and Nu Minority Autonomous County, Gongshan, 673500, Kunming, Yunnan, P.R. China
- Guoqiang Yang
  - Livestock and Poultry Breed Improvement Center, Nujiang Lisu Minority Autonomous Prefecture, 673199, Kunming, Yunnan, P.R. China
- Limin Sun
  - Yunnan Animal Husbandry Service, 650224, Kunming, Yunnan, P.R. China
- Wenkun Xu
  - Yunnan Animal Husbandry Service, 650224, Kunming, Yunnan, P.R. China
- Shangzhong Xu
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
- Huijiang Gao
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
- Lingyang Xu
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
- Xue Gao
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
- Junya Li
  - Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences, 100193, Beijing, P.R. China
2
Chen H, Xue J, Zhang Z, Zhang G, Xu X, Li H, Zhang R, Ullah N, Chen L, Amanullah, Zang Z, Lai S, He X, Li W, Guan M, Li J, Chen L, Deng C. High-speed rail model reveals the gene tandem amplification mediated by short repeated sequence in eukaryote. Sci Rep 2022; 12:2289. [PMID: 35145182 PMCID: PMC8831618 DOI: 10.1038/s41598-022-06250-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Accepted: 01/24/2022] [Indexed: 02/08/2023] Open
Abstract
The occurrence of gene duplication/amplification (GDA) provides potential material for adaptive evolution under environmental stress. Several molecular models have been proposed to explain GDA; recombination via short stretches of sequence similarity plays a crucial role. By screening genomes for such events, we propose a “SRS (short repeated sequence)*N + unit + SRS*N” amplified unit under USCE (unequal sister-chromatid exchange) for tandem amplification mediated by SRS with different repeat numbers in eukaryotes. The amplified units were identified from 2131 well-organized amplification events that generated multiple gene/element copies, with subsequent adaptive evolution in the respective species. The genomic data we analyzed showed dynamic changes among related species, subspecies, or plants from different ecotypes/strains. This study clarifies the characteristics of variable-copy-number SRS on both sides of the amplified unit under the USCE mechanism, to explain well-organized gene tandem amplification under environmental stress mediated by SRS in all eukaryotes.
Affiliation(s)
- Haidi Chen
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Jingwen Xue
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Zhenghou Zhang
  - The Fourth Affiliated Hospital of China Medical University, Shenyang, 110032, China
- Geyu Zhang
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Xinyuan Xu
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- He Li
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Ruxue Zhang
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Najeeb Ullah
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Lvxing Chen
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Amanullah
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Zhuqing Zang
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Shanshan Lai
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Ximiao He
  - Department of Physiology, School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
  - Center for Genomics and Proteomics Research, School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
  - Hubei Key Laboratory of Drug Target Research and Pharmacodynamic Evaluation, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Wei Li
  - Department of Dermatovenereology, Institutes for Systems Genetics, Rare Disease Center, West China Hospital, Sichuan University, No. 37 Guo Xue Xiang Street, Chengdu, 610041, Sichuan, China
- Miao Guan
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
- Jingyi Li
  - M.D. Department of Dermatology and Venereology, West China Hospital of Sichuan University, No. 37 Guo Xue Lane, Chengdu, 610041, China
- Liangbiao Chen
  - Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources (Ministry of Education), Institute of Experimental Pathology, Shanghai Ocean University, Shanghai, 201306, China
- Cheng Deng
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, 1 Wenyuan Rd., Nanjing, 210023, China
3
Suzuki Y, Morishita S. The time is ripe to investigate human centromeres by long-read sequencing. DNA Res 2021; 28:6381569. [PMID: 34609504 PMCID: PMC8502840 DOI: 10.1093/dnares/dsab021] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/31/2021] [Accepted: 09/28/2021] [Indexed: 01/05/2023] Open
Abstract
The complete sequencing of human centromeres, which are filled with highly repetitive elements, has long been challenging. In human centromeres, α-satellite monomers of about 171 bp in length are the basic repeating units, but α-satellite monomers constitute the higher-order repeat (HOR) units, and thousands of copies of highly homologous HOR units form large arrays, which have hampered sequence assembly of human centromeres. Because most HOR unit occurrences are covered by long reads of about 10 kb, the recent availability of much longer reads is expected to enable observation of individual HOR occurrences in terms of their single-nucleotide or structural variants. The time has come to examine the complete sequence of human centromeres.
Affiliation(s)
- Yuta Suzuki
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8568, Japan
- Shinichi Morishita
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8568, Japan
4
Abstract
Mutation of the human genome results in three classes of genomic variation: single nucleotide variants; short insertions or deletions; and large structural variants (SVs). Some mutations occur during normal processes, such as meiotic recombination or B cell development, and others result from DNA replication or aberrant repair of breaks in sequence-specific contexts. Regardless of mechanism, mutations are subject to selection, and some hotspots can manifest in disease. Here, we discuss genomic regions prone to mutation, mechanisms contributing to mutation susceptibility, and the processes leading to their accumulation in normal and somatic genomes. With further, more accurate human genome sequencing, additional mutation hotspots, mechanistic details of their formation, and the relevance of hotspots to evolution and disease are likely to be discovered.
5
Sullivan LL, Sullivan BA. Genomic and functional variation of human centromeres. Exp Cell Res 2020; 389:111896. [PMID: 32035947 PMCID: PMC7140587 DOI: 10.1016/j.yexcr.2020.111896] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 12/19/2019] [Revised: 01/29/2020] [Accepted: 02/05/2020] [Indexed: 10/25/2022]
Abstract
Centromeres are central to chromosome segregation and genome stability, and thus their molecular foundations are important for understanding their function and the ways in which they go awry. Human centromeres typically form at large megabase-sized arrays of alpha satellite DNA for which there is little genomic understanding due to its repetitive nature. Consequently, it has been difficult to achieve genome assemblies at centromeres using traditional next generation sequencing approaches, so that centromeres represent gaps in the current human genome assembly. The role of alpha satellite DNA has been debated since centromeres can form, albeit rarely, on non-alpha satellite DNA. Conversely, the simple presence of alpha satellite DNA is not sufficient for centromere function since chromosomes with multiple alpha satellite arrays only exhibit a single location of centromere assembly. Here, we discuss the organization of human centromeres as well as genomic and functional variation in human centromere location, and current understanding of the genomic and epigenetic mechanisms that underlie centromere flexibility in humans.
Affiliation(s)
- Beth A Sullivan
  - Department of Molecular Genetics and Microbiology, USA; Division of Human Genetics, Duke University School of Medicine, Durham, NC, 27710, USA
6
Miga KH. Centromeric Satellite DNAs: Hidden Sequence Variation in the Human Population. Genes (Basel) 2019; 10:E352. [PMID: 31072070 PMCID: PMC6562703 DOI: 10.3390/genes10050352] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 04/02/2019] [Revised: 05/03/2019] [Accepted: 05/03/2019] [Indexed: 12/30/2022] Open
Abstract
The central goal of medical genomics is to understand the inherited basis of sequence variation that underlies human physiology, evolution, and disease. Functional association studies currently ignore millions of bases that span each centromeric region and acrocentric short arm. These regions are enriched in long arrays of tandem repeats, or satellite DNAs, that are known to vary extensively in copy number and repeat structure in the human population. Satellite sequence variation in the human genome is often so large that it is detected cytogenetically, yet due to the lack of a reference assembly and informatics tools to measure this variability, contemporary high-resolution disease association studies are unable to detect causal variants in these regions. Nevertheless, recently uncovered associations between satellite DNA variation and human disease support that these regions present a substantial and biologically important fraction of human sequence variation. Therefore, there is a pressing and unmet need to detect and incorporate this uncharacterized sequence variation into broad studies of human evolution and medical genomics. Here I discuss the current knowledge of satellite DNA variation in the human genome, focusing on centromeric satellites and their potential implications for disease.
Affiliation(s)
- Karen H Miga
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, CA 95064, USA
7
McNulty SM, Sullivan BA. Alpha satellite DNA biology: finding function in the recesses of the genome. Chromosome Res 2018; 26:115-138. [PMID: 29974361 DOI: 10.1007/s10577-018-9582-3] [Citation(s) in RCA: 88] [Impact Index Per Article: 12.6] [Received: 05/03/2018] [Accepted: 06/14/2018] [Indexed: 02/05/2023]
Abstract
Repetitive DNA, formerly referred to by the misnomer "junk DNA," comprises a majority of the human genome. One class of this DNA, alpha satellite, comprises up to 10% of the genome. Alpha satellite is enriched at all human centromere regions and is competent for de novo centromere assembly. Because of the highly repetitive nature of alpha satellite, it has been difficult to achieve genome assemblies at centromeres using traditional next-generation sequencing approaches, and thus, centromeres represent gaps in the current human genome assembly. Moreover, alpha satellite DNA is transcribed into repetitive noncoding RNA and contributes to a large portion of the transcriptome. Recent efforts to characterize these transcripts and their function have uncovered pivotal roles for satellite RNA in genome stability, including silencing "selfish" DNA elements and recruiting centromere and kinetochore proteins. This review will describe the genomic and epigenetic features of alpha satellite DNA, discuss recent findings of noncoding transcripts produced from distinct alpha satellite arrays, and address current progress in the functional understanding of this oft-neglected repetitive sequence. We will discuss unique challenges of studying human satellite DNAs and RNAs and point toward new technologies that will continue to advance our understanding of this largely untapped portion of the genome.
Affiliation(s)
- Shannon M McNulty
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, 27710, USA
- Beth A Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, 27710, USA
  - Division of Human Genetics, Duke University Medical Center, Durham, NC, 27710, USA
8
Abstract
Genomic variation is a source of functional diversity that is typically studied in genic and non-coding regulatory regions. However, the extent of variation within noncoding portions of the human genome, particularly highly repetitive regions, and the functional consequences are not well understood. Satellite DNA, including α satellite DNA found at human centromeres, comprises up to 10% of the genome, but is difficult to study because its repetitive nature hinders contiguous sequence assemblies. We recently described variation within α satellite DNA that affects centromere function. On human chromosome 17 (HSA17), we showed that size and sequence polymorphisms within primary array D17Z1 are associated with chromosome aneuploidy and defective centromere architecture. However, HSA17 can counteract this instability by assembling the centromere at a second, "backup" array lacking variation. Here, we discuss our findings in a broader context of human centromere assembly, and highlight areas of future study to uncover links between genomic and epigenetic features of human centromeres.
Affiliation(s)
- Lori L Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
- Kimberline Chew
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
- Beth A Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
9
Satović E, Vojvoda Zeljko T, Luchetti A, Mantovani B, Plohl M. Adjacent sequences disclose potential for intra-genomic dispersal of satellite DNA repeats and suggest a complex network with transposable elements. BMC Genomics 2016; 17:997. [PMID: 27919246 PMCID: PMC5139131 DOI: 10.1186/s12864-016-3347-1] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 08/31/2016] [Accepted: 11/25/2016] [Indexed: 11/14/2022] Open
Abstract
Background: Satellite DNA (satDNA) sequences are typically arranged as arrays of tandemly repeated monomers. Due to the similarity among monomers, their organizational pattern and abundance, satDNAs are hardly accessible to structural and functional studies and still represent the most obscure genome component. Although many satDNA arrays of diverse length and even single monomers exist in the genome, surprisingly little is known about the transition from satDNAs to other sequences. Studying satDNA monomers at junctions and identifying DNA sequences adjacent to them can help to understand the processes that (re)distribute satDNAs and the significance that the evolution of these sequence elements might have in creating the genomic landscape. Results: We explored sets of randomly selected satDNA-harboring genomic fragments in four mollusc species to examine satDNA transition sites and the nature of adjacent sequences. All examined junctions are characterized by abrupt transitions from satDNAs to other sequences. Among them, junctions of only one examined satDNA mapped non-randomly (within the palindrome), indicating that a well-defined sequence feature is not a necessary prerequisite for junction formation. In the studied sample, satDNA flanking sequences can be roughly classified into two groups. The first group is composed of anonymous DNA sequences which occasionally include short segments of transposable elements (TEs) as well as segments of other satDNA sequences. In the second group, satDNA repeats and the array flanking sequences are identified as parts of TEs of the Helitron superfamily. There, some array flanking regions hold fragmented satDNA monomers alternating with anonymous sequences of comparable length to the missing monomer parts, suggesting a process of sequence reorganization by a mechanism able to excise short monomer parts and replace them with unrelated sequences.
Conclusions: The observed architecture of satDNA transition sites can be explained as a result of insertion and/or recombination events involving short arrays of satDNA monomers and TEs, in combination with a hypothetical transposition-related ability of satDNA monomers to be shuffled independently in the genome. We conclude that satDNAs and TEs can form a complex network of sequences which essentially share the propagation mechanisms and in synergy shape the genome.
Affiliation(s)
- Eva Satović
  - Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Andrea Luchetti
  - Dipartimento di Scienze Biologiche, Geologiche e Ambientali-Università di Bologna, Bologna, Italy
- Barbara Mantovani
  - Dipartimento di Scienze Biologiche, Geologiche e Ambientali-Università di Bologna, Bologna, Italy
- Miroslav Plohl
  - Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
10
Aldrup-MacDonald ME, Kuo ME, Sullivan LL, Chew K, Sullivan BA. Genomic variation within alpha satellite DNA influences centromere location on human chromosomes with metastable epialleles. Genome Res 2016; 26:1301-1311. [PMID: 27510565 PMCID: PMC5052062 DOI: 10.1101/gr.206706.116] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 03/08/2016] [Accepted: 08/08/2016] [Indexed: 01/27/2023]
Abstract
Alpha satellite is a tandemly organized type of repetitive DNA that comprises 5% of the genome and is found at all human centromeres. A defined number of 171-bp monomers are organized into chromosome-specific higher-order repeats (HORs) that are reiterated thousands of times. At least half of all human chromosomes have two or more distinct HOR alpha satellite arrays within their centromere regions. We previously showed that the two alpha satellite arrays of Homo sapiens Chromosome 17 (HSA17), D17Z1 and D17Z1-B, behave as centromeric epialleles, that is, the centromere, defined by chromatin containing the centromeric histone variant CENPA and recruitment of other centromere proteins, can form at either D17Z1 or D17Z1-B. Some individuals in the human population are functional heterozygotes in that D17Z1 is the active centromere on one homolog and D17Z1-B is active on the other. In this study, we aimed to understand the molecular basis for how centromere location is determined on HSA17. Specifically, we focused on D17Z1 genomic variation as a driver of epiallele formation. We found that D17Z1 arrays that are predominantly composed of HOR size and sequence variants were functionally less competent. They either recruited decreased amounts of the centromere-specific histone variant CENPA and the HSA17 was mitotically unstable, or alternatively, the centromere was assembled at D17Z1-B and the HSA17 was stable. Our study demonstrates that genomic variation within highly repetitive, noncoding DNA of human centromere regions has a pronounced impact on genome stability and basic chromosomal function.
Affiliation(s)
- Megan E Aldrup-MacDonald
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
- Molly E Kuo
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
- Lori L Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
- Kimberline Chew
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
- Beth A Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
  - Division of Human Genetics, Duke University Medical Center, Durham, North Carolina 27710, USA
11
Sevim V, Bashir A, Chin CS, Miga KH. Alpha-CENTAURI: assessing novel centromeric repeat sequence variation with long read sequencing. Bioinformatics 2016; 32:1921-1924. [PMID: 27153570 PMCID: PMC4920115 DOI: 10.1093/bioinformatics/btw101] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 10/05/2015] [Accepted: 02/17/2016] [Indexed: 11/13/2022] Open
Abstract
Motivation: Long arrays of near-identical tandem repeats are a common feature of centromeric and subtelomeric regions in complex genomes. These sequences present a source of repeat structure diversity that is commonly ignored by standard genomic tools. Unlike reads shorter than the underlying repeat structure that rely on indirect inference methods, e.g. assembly, long reads allow direct inference of satellite higher order repeat structure. To automate characterization of local centromeric tandem repeat sequence variation we have designed Alpha-CENTAURI (ALPHA satellite CENTromeric AUtomated Repeat Identification), that takes advantage of Pacific Bioscience long-reads from whole-genome sequencing datasets. By operating on reads prior to assembly, our approach provides a more comprehensive set of repeat-structure variants and is not impacted by rearrangements or sequence underrepresentation due to misassembly. Results: We demonstrate the utility of Alpha-CENTAURI in characterizing repeat structure for alpha satellite containing reads in the hydatidiform mole (CHM1, haploid-like) genome. The pipeline is designed to report local repeat organization summaries for each read, thereby monitoring rearrangements in repeat units, shifts in repeat orientation and sites of array transition into non-satellite DNA, typically defined by transposable element insertion. We validate the method by showing consistency with existing centromere high order repeat references. Alpha-CENTAURI can, in principle, run on any sequence data, offering a method to generate a sequence repeat resolution that could be readily performed using consensus sequences available for other satellite families in genomes without high-quality reference assemblies. Availability and implementation: Documentation and source code for Alpha-CENTAURI are freely available at http://github.com/volkansevim/alpha-CENTAURI. 
Contact: ali.bashir@mssm.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Volkan Sevim
  - Pacific Biosciences, Inc., Menlo Park, CA 94025, USA
- Ali Bashir
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA
- Karen H Miga
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
12
Zahn J, Kaplan MH, Fischer S, Dai M, Meng F, Saha AK, Cervantes P, Chan SM, Dube D, Omenn GS, Markovitz DM, Contreras-Galindo R. Expansion of a novel endogenous retrovirus throughout the pericentromeres of modern humans. Genome Biol 2015; 16:74. [PMID: 25886262 PMCID: PMC4425911 DOI: 10.1186/s13059-015-0641-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 02/16/2015] [Accepted: 03/23/2015] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Approximately 8% of the human genome consists of sequences of retroviral origin, a result of ancestral infections of the germ line over millions of years of evolution. The most recent of these infections is attributed to members of the human endogenous retrovirus type-K (HERV-K) (HML-2) family. We recently reported that a previously undetected, large group of HERV-K (HML-2) proviruses, which are descendants of the ancestral K111 infection, are spread throughout human centromeres. RESULTS Studying the genomes of certain cell lines and the DNA of healthy individuals that seemingly lack K111, we discover new HERV-K (HML-2) members hidden in pericentromeres of several human chromosomes. All are related through a common ancestor, termed K222, which is a virus that infected the germ line approximately 25 million years ago. K222 exists as a single copy in the genomes of baboons and high order primates, but not New World monkeys, suggesting that progenitor K222 infected the primate germ line after the split between New and Old World monkeys. K222 exists in modern humans at multiple loci spread across the pericentromeres of nine chromosomes, indicating it was amplified during the evolution of modern humans. CONCLUSIONS Copying of K222 may have occurred through recombination of the pericentromeres of different chromosomes during human evolution. Evidence of recombination between K111 and K222 suggests that these retroviral sequences have been templates for frequent cross-over events during the process of centromere recombination in humans.
Affiliation(s)
- Joseph Zahn
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Mark H Kaplan
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Sabrina Fischer
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Manhong Dai
- Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Fan Meng
- Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, MI, 48109, USA.
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Anjan Kumar Saha
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Patrick Cervantes
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Susana M Chan
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Derek Dube
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - Gilbert S Omenn
- Departments of Computational Medicine and Bioinformatics, Internal Medicine, and Human Genetics, and School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA.
| | - David M Markovitz
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
- Department of Internal Medicine, Division of Infectious Diseases, University of Michigan, Ann Arbor, MI, 48109-5640, USA.
| | - Rafael Contreras-Galindo
- Department of Internal Medicine, Division of Infectious Diseases and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109, USA.
- Department of Internal Medicine, Division of Infectious Diseases, University of Michigan, Ann Arbor, MI, 48109-5640, USA.
|
13
|
Scott KC, Sullivan BA. Neocentromeres: a place for everything and everything in its place. Trends Genet 2013; 30:66-74. [PMID: 24342629 DOI: 10.1016/j.tig.2013.11.003] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Received: 10/01/2013] [Revised: 11/15/2013] [Accepted: 11/19/2013] [Indexed: 01/07/2023]
Abstract
Centromeres are essential for chromosome inheritance and genome stability. Centromeric proteins, including the centromeric histone centromere protein A (CENP-A), define the site of centromeric chromatin and kinetochore assembly. In many organisms, centromeres are located in or near regions of repetitive DNA. However, some atypical centromeres spontaneously form on unique sequences. These neocentromeres, or new centromeres, were first identified in humans, but have since been described in other organisms. Neocentromeres are functionally and structurally similar to endogenous centromeres, but lack the added complication of underlying repetitive sequences. Here, we discuss recent studies in chicken and fungal systems where genomic engineering can promote neocentromere formation. These studies reveal key genomic and epigenetic factors that support de novo centromere formation in eukaryotes.
Affiliation(s)
- Kristin C Scott
- Institute for Genome Sciences & Policy, Duke University, DUMC 3382, Durham, NC 27708, USA; Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Division of Human Genetics, Duke University Medical Center, Durham, NC 27710, USA.
| | - Beth A Sullivan
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Division of Human Genetics, Duke University Medical Center, Durham, NC 27710, USA.
|
14
|
Contreras-Galindo R, Kaplan MH, He S, Contreras-Galindo AC, Gonzalez-Hernandez MJ, Kappes F, Dube D, Chan SM, Robinson D, Meng F, Dai M, Gitlin SD, Chinnaiyan AM, Omenn GS, Markovitz DM. HIV infection reveals widespread expansion of novel centromeric human endogenous retroviruses. Genome Res 2013; 23:1505-13. [PMID: 23657884 PMCID: PMC3759726 DOI: 10.1101/gr.144303.112] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Received: 06/06/2012] [Accepted: 04/30/2013] [Indexed: 12/17/2022]
Abstract
Human endogenous retroviruses (HERVs) make up 8% of the human genome. The HERV-K (HML-2) family is the most recent group of these viruses to have inserted into the genome, and we have detected the activation of HERV-K (HML-2) proviruses in the blood of patients with HIV-1 infection. We report that HIV-1 infection activates expression of a novel HERV-K (HML-2) provirus, termed K111, present in multiple copies in the centromeres of chromosomes throughout the human genome yet not annotated in the most recent human genome assembly. Infection with HIV-1 or stimulation with the HIV-1 Tat protein leads to the activation of K111 proviruses. K111 is present as a single copy in the genome of the chimpanzee, yet K111 is not found in the genomes of other primates. Remarkably, K111 proviruses appear in the genomes of the extinct Neanderthal and Denisovan, while modern humans have at least 100 K111 proviruses spread across the centromeres of 15 chromosomes. Our studies suggest that the progenitor K111 integrated before the Homo-Pan divergence and expanded in copy number during the evolution of hominins, perhaps by recombination. The expansion of K111 provides sequence evidence suggesting that recombination between the centromeres of various chromosomes took place during the evolution of humans. K111 proviruses show significant sequence variations in each individual centromere, which may serve as markers in future efforts to annotate human centromere sequences. Further, this work is an example of the potential to discover previously unknown genomic sequences through the analysis of nucleic acids found in the blood of patients.
Affiliation(s)
- Rafael Contreras-Galindo
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Mark H. Kaplan
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Shirley He
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Angie C. Contreras-Galindo
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Marta J. Gonzalez-Hernandez
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Ferdinand Kappes
- Institute of Biochemistry and Molecular Biology, Medical School, RWTH Aachen University, 52074 Aachen, Germany
| | - Derek Dube
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Susana M. Chan
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Dan Robinson
- Michigan Center for Translational Pathology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Comprehensive Cancer Center, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
| | - Fan Meng
- Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48109, USA
- Department of Psychiatry, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Manhong Dai
- Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Scott D. Gitlin
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
- Comprehensive Cancer Center, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Veteran Affairs Health System, Ann Arbor, Michigan 48105, USA
| | - Arul M. Chinnaiyan
- Michigan Center for Translational Pathology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Comprehensive Cancer Center, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Howard Hughes Medical Institute
| | - Gilbert S. Omenn
- Departments of Computational Medicine and Bioinformatics, Internal Medicine, and Human Genetics, and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - David M. Markovitz
- Department of Internal Medicine, and Programs in Immunology, Cancer Biology, and Cellular and Molecular Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
- Comprehensive Cancer Center, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
|
15
|
Meštrović N, Pavlek M, Car A, Castagnone-Sereno P, Abad P, Plohl M. Conserved DNA Motifs, Including the CENP-B Box-like, Are Possible Promoters of Satellite DNA Array Rearrangements in Nematodes. PLoS One 2013; 8:e67328. [PMID: 23826269 PMCID: PMC3694981 DOI: 10.1371/journal.pone.0067328] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 02/08/2013] [Accepted: 05/17/2013] [Indexed: 12/27/2022] Open
Abstract
Tandemly arrayed non-coding sequences or satellite DNAs (satDNAs) are rapidly evolving segments of eukaryotic genomes, including the centromere, and may raise a genetic barrier that leads to speciation. However, determinants and mechanisms of satDNA sequence dynamics are only partially understood. Sequence analyses of a library of five satDNAs common to the root-knot nematodes Meloidogyne chitwoodi and M. fallax, together with a satDNA specific to M. chitwoodi only, revealed low sequence identity (32-64%) among them. However, despite sequence differences, two conserved motifs were recovered. One of them turned out to be highly similar to the CENP-B box of human alpha satDNA, identical in 10-12 out of 17 nucleotides. In addition, organization of nematode satDNAs was comparable to that found in alpha satDNA of human and primates, characterized by monomers concurrently arranged in simple and higher-order repeat (HOR) arrays. In contrast to alpha satDNA, phylogenetic clustering of nematode satDNA monomers extracted either from simple or from HOR arrays indicated frequent shuffling between these two organizational forms. Comparison of homogeneous simple arrays and complex HORs composed of different satDNAs enabled, for the first time, the identification of conserved motifs as obligatory components of monomer junctions. This observation highlights the role of short motifs in rearrangements, even among highly divergent sequences. Two mechanisms are proposed to be involved in this process, i.e., putative transposition-related cut-and-paste insertions and/or illegitimate recombination. The possible involvement of the nematode CENP-B box-like sequence in the transposition-related mechanism, together with the previously established similarity of the human CENP-B protein and pogo-like transposases, implicates a novel role of the CENP-B box and related sequence motifs in addition to the known function in centromere protein binding.
Affiliation(s)
- Nevenka Meštrović
- Department of Molecular Biology, Rudjer Bošković Institute, Zagreb, Croatia
|
16
|
Melters DP, Bradnam KR, Young HA, Telis N, May MR, Ruby JG, Sebra R, Peluso P, Eid J, Rank D, Garcia JF, DeRisi JL, Smith T, Tobias C, Ross-Ibarra J, Korf I, Chan SWL. Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution. Genome Biol 2013; 14:R10. [PMID: 23363705 PMCID: PMC4053949 DOI: 10.1186/gb-2013-14-1-r10] [Citation(s) in RCA: 337] [Impact Index Per Article: 28.1] [Received: 09/29/2012] [Accepted: 01/30/2013] [Indexed: 01/01/2023] Open
Abstract
Background: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results: Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions: While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.
|
17
|
Tsai WL, Forbes JG, Wang K. Engineering of an elastic scaffolding polyprotein based on an SH3-binding intrinsically disordered titin PEVK module. Protein Expr Purif 2012; 85:187-99. [PMID: 22910563 PMCID: PMC3463739 DOI: 10.1016/j.pep.2012.08.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/07/2012] [Revised: 07/24/2012] [Accepted: 08/03/2012] [Indexed: 01/21/2023]
Abstract
Titin is a large elastic protein found in muscle that maintains the elasticity and structural integrity of the sarcomere. The PEVK region of titin is intrinsically disordered, highly elastic and serves as a hub to bind signaling proteins. Systematic investigation of the structure and affinity profile of the PEVK region will provide important information about the functions of titin. Since PEVK is highly heterogeneous due to extensive differential splicing from more than one hundred exons, we engineered and expressed polyproteins that consist of a defined number of identical single exon modules. These customized polyproteins reduce heterogeneity, amplify interactions of less dominant modules, and most importantly, provide tags for atomic force microscopy and allow more readily interpretable data from single-molecule techniques. Expression and purification of recombinant polyprotein with repeat regions presented many technical challenges: recombination events in tandem repeats of identical DNA sequences exacerbated by high GC content; toxicity of polymer plasmid and expressed protein to the bacteria; early truncation of proteins expressed with different numbers of modules; and extreme sensitivity to proteolysis. We have investigated a number of in vitro and in vivo bacterial and yeast expression systems, as well as baculoviral systems, as potential solutions to these problems. We successfully expressed and purified in gram quantities a polyprotein derived from human titin exon 172 using Pichia pastoris yeast. This study provides valuable insights into the technical challenges regarding the engineering and purification of a tandem repeat sequence of an intrinsically disordered biopolymer.
Affiliation(s)
- Wanxia Li Tsai
- Muscle Proteomics and Nanotechnology Section, Laboratory of Muscle Biology, National Institute of Arthritis and Musculoskeletal and Skin Diseases, NIH/DHHS, Bethesda, MD 20892-8024, USA.
|
18
|
Abstract
Human centromeres are defined by megabases of homogenous alpha-satellite DNA arrays that are packaged into specialized chromatin marked by the centromeric histone variant, centromeric protein A (CENP-A). Although most human chromosomes have a single higher-order repeat (HOR) array of alpha satellites, several chromosomes have more than one HOR array. Homo sapiens chromosome 17 (HSA17) has two juxtaposed HOR arrays, D17Z1 and D17Z1-B. Only D17Z1 has been linked to CENP-A chromatin assembly. Here, we use human artificial chromosome assembly assays to show that both D17Z1 and D17Z1-B can support de novo centromere assembly independently. We extend these in vitro studies and demonstrate, using immunostaining and chromatin analyses, that in human cells the centromere can be assembled at D17Z1 or D17Z1-B. Intriguingly, some humans are functional heterozygotes, meaning that CENP-A is located at a different HOR array on the two HSA17 homologs. The site of CENP-A assembly on HSA17 is stable and is transmitted through meiosis, as evidenced by inheritance of CENP-A location through multigenerational families. Differences in histone modifications are not linked clearly with active and inactive D17Z1 and D17Z1-B arrays; however, we detect a correlation between the presence of variant repeat units of D17Z1 and CENP-A assembly at the opposite array, D17Z1-B. Our studies reveal the presence of centromeric epialleles on an endogenous human chromosome and suggest genomic complexities underlying the mechanisms that determine centromere identity in humans.
|
19
|
Pertile MD, Graham AN, Choo KHA, Kalitsis P. Rapid evolution of mouse Y centromere repeat DNA belies recent sequence stability. Genome Res 2009; 19:2202-13. [PMID: 19737860 DOI: 10.1101/gr.092080.109] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 11/24/2022]
Abstract
The Y centromere sequence of house mouse, Mus musculus, remains unknown despite our otherwise significant knowledge of the genome sequence of this important mammalian model organism. Here, we report the complete molecular characterization of the C57BL/6J chromosome Y centromere, which comprises a highly diverged minor satellite-like sequence (designated Ymin) with higher-order repeat (HOR) sequence organization previously undescribed at mouse centromeres. The Ymin array is approximately 90 kb in length and resides within a single BAC clone that provides sequence information spanning an endogenous animal centromere for the first time. By exploiting direct patrilineal inheritance of the Y chromosome, we demonstrate stability of the Y centromere DNA structure spanning at least 175 inbred generations to beyond the time of domestication of the East Asian M.m. molossinus "fancy" mouse through which the Y chromosome was first introduced into the classical inbred laboratory mouse strains. Despite this stability, at least three unequal genetic exchange events have altered Ymin HOR unit length and sequence structure since divergence of the ancestral Mus musculus subspecies around 900,000 yr ago, with major turnover of the HOR arrays driving rapid divergence of sequence and higher-order structure at the mouse Y centromere. A comparative sequence analysis between the human and chimpanzee centromeres indicates a similar rapid divergence of the primate Y centromere. Our data point to a unique DNA sequence and organizational architecture for the mouse Y centromere that has evolved independently of all other mouse centromeres.
Affiliation(s)
- Mark D Pertile
- Murdoch Childrens Research Institute, Victoria, Australia
|
20
|
Takasuka TE, Cioffi A, Stein A. Sequence information encoded in DNA that may influence long-range chromatin structure correlates with human chromosome functions. PLoS One 2008; 3:e2643. [PMID: 18612465 PMCID: PMC2440353 DOI: 10.1371/journal.pone.0002643] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 05/08/2008] [Accepted: 06/11/2008] [Indexed: 11/18/2022] Open
Abstract
Little is known about the possible function of the bulk of the human genome. We have recently shown that long-range regular oscillation in the motif non-T, A/T, G (VWG) existing at ten-nucleotide multiples influences large-scale nucleosome array formation. In this work, we have determined the locations of all 100 kb regions that are predicted to form distinctive chromatin structures throughout each human chromosome (except Y). Using these data, we found that a significantly greater fraction of 300 kb sequences lacked annotated transcripts in genomic DNA regions ≥ 300 kb that contained nearly continuous chromatin organizing signals than in control regions. We also found a relationship between the meiotic recombination frequency and the presence of strong VWG chromatin organizing signals. Large (≥ 300 kb) genomic DNA regions having low average recombination frequency are enriched in chromatin organizing signals. As additional controls, we show using chromosome 1 that the VWG motif signals are not enriched in randomly selected DNA regions having the mean size of the recombination coldspots, and that non-VWG motif sets do not generate signals that are enriched in recombination coldspots. We also show that tandemly repeated alpha satellite DNA contains strong VWG signals for the formation of distinctive nucleosome arrays, consistent with the low recombination activity of centromeres. Our correlations cannot be explained simply by variations in the GC content. Our findings suggest that a specific set of periodic DNA motifs encoded in genomic DNA, which provide signals for chromatin organization, influence human chromosome function.
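As a rough illustration of the VWG signal described in this abstract, the sketch below locates VWG motifs (V = A, C or G; W = A or T; then G) and measures how often pairs of motifs are spaced at multiples of ten nucleotides. It is a minimal sketch, not the authors' method: the function names, the `max_gap` cutoff and the pair-counting score are all assumptions introduced here for illustration.

```python
def vwg_positions(seq):
    """Return start indices of VWG motifs: V = A/C/G (non-T), W = A/T, then G."""
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - 2)
        if seq[i] in "ACG" and seq[i + 1] in "AT" and seq[i + 2] == "G"
    ]


def period_fraction(positions, period=10, max_gap=100):
    """Fraction of motif pairs (spacing <= max_gap) whose spacing is a multiple of `period`.

    `positions` must be sorted in ascending order. The score is a hypothetical
    stand-in for a proper periodicity analysis.
    """
    hits = 0
    total = 0
    for idx, first in enumerate(positions):
        for second in positions[idx + 1:]:
            gap = second - first
            if gap > max_gap:
                break  # positions are sorted, so later gaps are larger still
            total += 1
            if gap % period == 0:
                hits += 1
    return hits / total if total else 0.0
```

For a sequence with no ten-nucleotide phasing, the score would be expected to sit near the chance level of about one in ten; values well above that would hint at the kind of regular oscillation the abstract refers to.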
Affiliation(s)
- Taichi E. Takasuka
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
| | - Alfred Cioffi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
| | - Arnold Stein
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
|
21
|
Abstract
Centromeres are special structures of eukaryotic chromosomes that hold sister chromatids together and ensure proper chromosome segregation during cell division. Centromeres consist of repeated sequences, which have hindered the study of centromere mitotic recombination and its consequences for centromeric function. We use a chromosome orientation fluorescence in situ hybridization technique to visualize and quantify recombination events at mouse centromeres. We show that centromere mitotic recombination occurs in normal cells at a higher frequency than telomere recombination and at a much higher frequency than chromosome-arm recombination. Furthermore, we show that centromere mitotic recombination is increased in cells lacking the Dnmt3a and Dnmt3b DNA methyltransferases, suggesting that the epigenetic state of centromeric heterochromatin controls recombination events at these regions. Increased centromere recombination in Dnmt3a,3b-deficient cells is accompanied by changes in the length of centromere repeats, suggesting that prevention of illicit centromere recombination is important to maintain centromere integrity in the mouse.
Affiliation(s)
- Isabel Jaco
- Telomeres and Telomerase Group, Molecular Oncology Program, Spanish National Cancer Centre, 28029 Madrid, Spain
|
22
|
Rosandić M, Paar V, Basar I, Gluncić M, Pavin N, Pilas I. CENP-B box and pJalpha sequence distribution in human alpha satellite higher-order repeats (HOR). Chromosome Res 2006; 14:735-53. [PMID: 17115329 DOI: 10.1007/s10577-006-1078-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Received: 12/23/2005] [Accepted: 06/03/2006] [Indexed: 01/13/2023]
Abstract
Using our Key String Algorithm (KSA) to analyze Build 35.1 assembly we determined consensus alpha satellite higher-order repeats (HOR) and consensus distributions of CENP-B box and pJalpha motif in human chromosomes 1, 4, 5, 7, 8, 10, 11, 17, 19, and X. We determined new suprachromosomal family (SF) assignments: SF5 for 13mer (2211 bp), SF5 for 13mer (2214 bp), SF2 for 11mer (1869 bp), SF1 for 18mer (3058 bp), SF3 for 12mer (2047 bp), SF3 for 14mer (2379 bp), and SF5 for 17mer (2896 bp) in chromosomes 4, 5, 8, 10, 11, 17, and 19, respectively. In chromosome 5 we identified SF5 13mer without any CENP-B box and pJalpha motif, highly homologous (96%) to 13mer in chromosome 19. Additionally, in chromosome 19 we identified new SF5 17mer with one CENP-B box and pJalpha motif, aligned to 13mer by deleting four monomers. In chromosome 11 we identified SF3 12mer, homologous to 12mer in chromosome X. In chromosome 10 we identified new SF1 18mer with eight CENP-B boxes in every other monomer (except one). In chromosome 4 we identified new SF5 13mer with CENP-B box in three consecutive monomers. We found four exceptions to the rule that CENP-B box belongs to type B and pJalpha motif to type A monomers.
Affiliation(s)
- Marija Rosandić
- Department of Internal Medicine, University Hospital Rebro, University of Zagreb, 10000, Zagreb, Croatia
|
23
|
Kolas NK, Svetlanov A, Lenzi ML, Macaluso FP, Lipkin SM, Liskay RM, Greally J, Edelmann W, Cohen PE. Localization of MMR proteins on meiotic chromosomes in mice indicates distinct functions during prophase I. J Cell Biol 2005; 171:447-58. [PMID: 16260499 PMCID: PMC2171243 DOI: 10.1083/jcb.200506170] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.0] [Indexed: 12/25/2022]
Abstract
Mammalian MutL homologues function in DNA mismatch repair (MMR) after replication errors and in meiotic recombination. Both functions are initiated by a heterodimer of MutS homologues specific to either MMR (MSH2-MSH3 or MSH2-MSH6) or crossing over (MSH4-MSH5). Mutations of three of the four MutL homologues (Mlh1, Mlh3, and Pms2) result in meiotic defects. We show herein that two distinct complexes involving MLH3 are formed during murine meiosis. The first is a stable association between MLH3 and MLH1 and is involved in promoting crossing over in conjunction with MSH4-MSH5. The second complex involves MLH3 together with MSH2-MSH3 and localizes to repetitive sequences at centromeres and the Y chromosome. This complex is up-regulated in Pms2-/- males, but not females, providing an explanation for the sexual dimorphism seen in Pms2-/- mice. The association of MLH3 with repetitive DNA sequences is coincident with MSH2-MSH3 and is decreased in Msh2-/- and Msh3-/- mice, suggesting a novel role for the MMR family in the maintenance of repeat unit integrity during mammalian meiosis.
Affiliation(s)
- Nadine K Kolas
- Department of Molecular Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
|
24
|
Schueler MG, Dunn JM, Bird CP, Ross MT, Viggiano L, Rocchi M, Willard HF, Green ED. Progressive proximal expansion of the primate X chromosome centromere. Proc Natl Acad Sci U S A 2005; 102:10563-8. [PMID: 16030148 PMCID: PMC1180780 DOI: 10.1073/pnas.0503346102] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.2] [Indexed: 11/18/2022] Open
Abstract
Previous studies of the pericentromeric region of the human X chromosome short arm (Xp) revealed an age gradient from ancient DNA that contains expressed genes to recent human-specific DNA at the functional centromere. We analyzed the finished sequence of this human genomic region to investigate its evolutionary history. Phylogenetic analysis of >1,500 alpha-satellite monomers from the region revealed the presence of five physical domains, each containing monomers from a distinct phylogenetic clade. The most distal domain contains long interspersed nucleotide element repeats that were active >35 million years ago, whereas the four proximal domains contain more recently active long interspersed nucleotide element repeats. An out-of-register, unequal recombination (i.e., crossover) detected at the edge of the X chromosome-specific alpha-satellite array (DXZ1) may reflect the most recent of a series of punctuating events during evolution that resulted in a proximal physical expansion of the X centromere. The first 18 kb of this array has 97-99% pairwise identity among all 2-kb repeat units. To perform more detailed evolutionary comparisons, we sequenced the junction between the ancient DNA of Xp and the primate-specific alpha satellite in chimpanzee, gorilla, orangutan, vervet, macaque, and baboon. The striking conservation found in all cases supports the ancestral nature of the alpha satellite at this location. These studies demonstrate that the primate X centromere appears to have evolved through repeated expansion events occurring within the central, active region of centromeric DNA, with the newly added sequences then conferring centromere function.
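The 97-99% pairwise identity figure quoted above can be pictured with a small sketch that compares every pair of repeat units and reports the lowest and highest identity. This is only an illustration under stated assumptions: the units are taken to be already aligned and of equal length, and the function names are hypothetical. A real analysis would first align the units with a dedicated alignment tool.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of positions at which two pre-aligned, equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def pairwise_identity_range(units):
    """Return (min, max) percent identity over all pairs of repeat units."""
    if len(units) < 2:
        raise ValueError("need at least two repeat units")
    scores = [percent_identity(a, b) for a, b in combinations(units, 2)]
    return min(scores), max(scores)
```

Applied to a set of 2-kb repeat units from a homogenized array, a narrow range close to 100% would correspond to the high within-array identity the abstract reports.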
Affiliation(s)
- Mary G Schueler
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
25
|
Hall SE, Luo S, Hall AE, Preuss D. Differential rates of local and global homogenization in centromere satellites from Arabidopsis relatives. Genetics 2005; 170:1913-27. [PMID: 15937135 PMCID: PMC1449784 DOI: 10.1534/genetics.104.038208] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022] Open
Abstract
Higher eukaryotic centromeres contain thousands of satellite repeats organized into tandem arrays. As species diverge, new satellite variants are homogenized within and between chromosomes, yet the processes by which particular sequences are dispersed are poorly understood. Here, we isolated and analyzed centromere satellites in plants separated from Arabidopsis thaliana by 5-20 million years, uncovering more rapid satellite divergence compared to primate alpha-satellite repeats. We also found that satellites derived from the same genomic locus were more similar to each other than satellites derived from disparate genomic regions, indicating that new sequence alterations were homogenized more efficiently at a local, rather than global, level. Nonetheless, the presence of higher-order satellite arrays, similar to those identified in human centromeres, indicated limits to local homogenization and suggested that sequence polymorphisms may play important functional roles. In two species, we defined more extensive polymorphisms, identifying physically separated and highly distinct satellite types. Taken together, these data show that there is a balance between plant satellite homogenization and the persistence of satellite variants. This balance could ultimately generate sufficient sequence divergence to cause mating incompatibilities between plant species, while maintaining adequate conservation within a species for centromere activity.
MESH Headings
- Amino Acid Sequence
- Arabidopsis/genetics
- Base Sequence
- Centromere/genetics
- Chromatin Immunoprecipitation
- Consensus Sequence
- DNA, Plant/analysis
- DNA, Satellite/genetics
- DNA, Satellite/metabolism
- Fluorescein-5-isothiocyanate
- Fluorescent Antibody Technique, Direct
- Fluorescent Dyes
- Genome, Plant
- Heterochromatin/metabolism
- In Situ Hybridization, Fluorescence
- Indoles
- Microscopy, Fluorescence
- Molecular Sequence Data
- Phylogeny
- Sequence Analysis, DNA
- Sequence Homology, Amino Acid
- Sequence Homology, Nucleic Acid
Affiliation(s)
- Sarah E Hall
- Howard Hughes Medical Institute, University of Chicago, Chicago, Illinois 60637, USA
|
26
|
Basu J, Stromberg G, Compitello G, Willard HF, Van Bokkelen G. Rapid creation of BAC-based human artificial chromosome vectors by transposition with synthetic alpha-satellite arrays. Nucleic Acids Res 2005; 33:587-96. [PMID: 15673719 PMCID: PMC548352 DOI: 10.1093/nar/gki207] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.7] [Indexed: 01/14/2023] Open
Abstract
Efficient construction of BAC-based human artificial chromosomes (HACs) requires optimization of each key functional unit as well as development of techniques for the rapid and reliable manipulation of high-molecular weight BAC vectors. Here, we have created synthetic chromosome 17-derived alpha-satellite arrays, based on the 16-monomer repeat length typical of natural D17Z1 arrays, in which the consensus CENP-B box elements are either completely absent (0/16 monomers) or increased in density (16/16 monomers) compared to D17Z1 alpha-satellite (5/16 monomers). Using these vectors, we show that the presence of CENP-B box elements is a requirement for efficient de novo centromere formation and that increasing the density of CENP-B box elements may enhance the efficiency of de novo centromere formation. Furthermore, we have developed a novel, high-throughput methodology that permits the rapid conversion of any genomic BAC target into a HAC vector by transposon-mediated modification with synthetic alpha-satellite arrays and other key functional units. Taken together, these approaches offer the potential to significantly advance the utility of BAC-based HACs for functional annotation of the genome and for applications in gene transfer.
Affiliation(s)
- Joydeep Basu
- Institute for Genome Sciences and Policy, Duke University CIEMAS Room 2379, 101 Science Drive, Durham, NC 27708, USA.
|
27
|
Clemente M, de Miguel N, Lia VV, Matrajt M, Angel SO. Structure analysis of two Toxoplasma gondii and Neospora caninum satellite DNA families and evolution of their common monomeric sequence. J Mol Evol 2004; 58:557-67. [PMID: 15170259 DOI: 10.1007/s00239-003-2578-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 09/01/2003] [Accepted: 11/26/2003] [Indexed: 10/26/2022]
Abstract
A family of repetitive DNA elements of approximately 350 bp (Sat350) that are members of Toxoplasma gondii satellite DNA was further analyzed. Sequence analysis identified at least three distinct repeat types within this family, called types A, B, and C. B repeats were divided into the subtypes B1 and B2. A search for internal repetitions within this family permitted the identification of conserved regions and the design of PCR primers that amplify almost all these repetitive elements. These primers amplified the expected 350-bp repeats and a novel 680-bp repetitive element (Sat680) related to this family. Two additional tandemly repeated high-order structures corresponding to this satellite DNA family were found by searching the Toxoplasma genome database with these sequences. These studies were confirmed by sequence analysis and identified (1) an arrangement of AB1CB2 350-bp repeats and (2) an arrangement of two 350-bp-like repeats, resulting in a 680-bp monomer. Sequence comparison and phylogenetic analysis indicated that both high-order structures may have originated from the same ancestral 350-bp repeat. PCR amplification, sequence analysis and Southern blot showed that similar high-order structures were also found in the Toxoplasma-sister taxon Neospora caninum. The Toxoplasma genome database (http://ToxoDB.org) permitted the assembly of a contig harboring Sat350 elements at one end and a long nonrepetitive DNA sequence flanking this satellite DNA. The region bordering the Sat350 repeats contained two differentially expressed sequence-related regions and interstitial telomeric sequences.
|
28
|
Kawabe A, Nasuda S. Structure and genomic organization of centromeric repeats in Arabidopsis species. Mol Genet Genomics 2004; 272:593-602. [PMID: 15586291 DOI: 10.1007/s00438-004-1081-x] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.9] [Received: 05/20/2004] [Accepted: 10/05/2004] [Indexed: 10/26/2022]
Abstract
Centromeric repetitive sequences were isolated from Arabidopsis halleri ssp. gemmifera and A. lyrata ssp. kawasakiana. Two novel repeat families isolated from A. gemmifera were designated pAge1 and pAge2. These repeats are 180 bp in length and are organized in a head-to-tail manner. They are similar to the pAL1 repeats of A. thaliana and the pAa units of A. arenosa. Both A. gemmifera and A. kawasakiana possess the pAa, pAge1 and pAge2 repeat families. Sequence comparisons of different centromeric repeats revealed that these families share a highly conserved region of approximately 50 bp. Within each of the four repeat families, two or three regions showed low levels of sequence variation. The average difference in nucleotide sequence was approximately 10% within families and 30% between families, which resulted in clear distinctions between families upon phylogenetic analysis. FISH analysis revealed that the localization patterns for the pAa, pAge1 and pAge2 families were chromosome specific in A. gemmifera and A. kawasakiana. In one pair of chromosomes in A. gemmifera, and three pairs of chromosomes in A. kawasakiana, two repeat families were present. The presence of three families of centromeric repeats in A. gemmifera and A. kawasakiana indicates that the first step toward homogenization of centromeric repeats occurred at the chromosome level.
Affiliation(s)
- A Kawabe
- Laboratory of Plant Genetics, Graduate School of Agriculture, Kyoto University, 606-8502, Kyoto, Japan
|
29
|
Kato M. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison. Biol Proced Online 2003; 5:63-68. [PMID: 12734555 PMCID: PMC152575 DOI: 10.1251/bpo47] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Received: 12/06/2002] [Revised: 01/19/2003] [Accepted: 02/06/2003] [Indexed: 11/24/2022] Open
Abstract
Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA.
Affiliation(s)
- Mikio Kato
- Department of Life Sciences, Osaka Prefecture University, 1-1 Gakuencho, Sakai 599-8531, Japan
|
30
|
Ogata N, Morino H. Elongation of repetitive DNA by DNA polymerase from a hyperthermophilic bacterium Thermus thermophilus. Nucleic Acids Res 2000; 28:3999-4004. [PMID: 11024180 PMCID: PMC110782 DOI: 10.1093/nar/28.20.3999] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] Open
Abstract
Short repetitive DNA sequences are believed to be one of the primordial genetic elements that served as a source of complex large DNA found in the genome of modern organisms. However, the mechanism of its expansion (increase in repeat number) during the course of evolution is unclear. We demonstrate that the DNA polymerase of the hyperthermophilic bacterium Thermus thermophilus can elongate oligoDNA with several tandem repeats to very long DNA in vitro. For instance, 48mer repetitive oligoDNA (TACATGTA)₆, which has 25% GC content and a palindromic sequence, can be elongated up to approximately 10 000 bases by DNA polymerase at 74 °C without template DNA. OligoDNA having a different GC content or a quasi-palindromic sequence can also be elongated, but less efficiently. A spectroscopic thermal melting experiment with the oligoDNA showed that its hairpin-coil transition temperature was very close to the elongation reaction temperature (74 °C), but was much higher than the temperature at which duplex oligoDNA can exist stably. Taken together, we conclude that repetitive oligoDNA with a palindromic or quasi-palindromic sequence is elongated extensively by a hyperthermophilic DNA polymerase through hairpin-coil transitions. We propose that such an elongation mechanism might have been a driving force to expand primordial short DNA.
Affiliation(s)
- N Ogata
- Taiko Pharmaceutical Co., Ltd, 3-34-14 Uchihonmachi, Suita, Osaka 564-0032, Japan.
|
31
|
Laurent AM, Puechberty J, Roizès G. Hypothesis: for the worst and for the best, L1Hs retrotransposons actively participate in the evolution of the human centromeric alphoid sequences. Chromosome Res 1999; 7:305-17. [PMID: 10461876 DOI: 10.1023/a:1009283015738] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]
Abstract
A number of questions concerning the evolution and the function of the alpha satellite DNA sequences present at the centromere of all human chromosomes are still open. In this paper, we present data which could contribute to understanding these points. It is shown here that the alphoid sequences within which L1 elements are found are quite divergent from those of the homogeneous alphoid subsets present at each centromere where none has so far been detected. In addition, a number of L1s are detected close to the ends of the alpha satellite blocks. A fairly high proportion exhibit a polymorphism of presence/absence. Strikingly, several L1s localized at a distance from each other are always either present or absent simultaneously. This is interpreted as resulting from intrachromosomal recombination, through distant L1s, leading to deletion of several of them at once together with their surrounding alphoid sequences. The parameters determining which portion of the several megabases of alphoid sequences is actually involved in the centromeric function are not known. From the above data we suggest that the alpha satellite domain within which DNA sequences are recruited to form a centromere is both homogeneous in sequence and uninterrupted by L1s or any other retrotransposons. Conversely, non-centromere competent alphoid sequences would be both divergent and punctuated by scattered L1 elements, particularly at the borders of the alphoid blocks. On the grounds of these data and hypotheses, a model is presented in which it is postulated that accumulation of L1 insertions within a centromere competent alphoid domain is ruining this competence, the consequence being damage to or even loss of the centromere-forming capability of the chromosome. 
Restoration of fully centromere-forming competence is supposed to occur by two alternative means, either de-novo amplification of a homogeneous and uninterrupted alphoid domain or by unequal crossing over with a homologue harbouring a large competent one. If L1 retrotransposons are acting detrimentally to centromere integrity (for the worst), one must also consider them as having positive consequences on chromosomes by preventing their centromeres from swelling indefinitely by the addition of alphoid sequences (for the best). The data and ideas presented here fit well with those already put forward by Csink and Henikoff (1998) using the example of Drosophila.
Affiliation(s)
- A M Laurent
- Séquences répétées et centromères humains, Institut de Génétique Humaine UPR 1142, Institut de Biologie, Montpellier, France
|
32
|
López CC, Edström JE. Interspersed centromeric element with a CENP-B box-like motif in Chironomus pallidivittatus. Nucleic Acids Res 1998; 26:4168-72. [PMID: 9722636 PMCID: PMC147845 DOI: 10.1093/nar/26.18.4168] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] Open
Abstract
Short mobile elements are present in different recombined forms as interspersed GC-rich islands between AT-rich centromeric 155 bp tandem repeats in the dipteran Chironomus pallidivittatus. The basic element is 80 bp long, has a pronounced invert repeat structure and contains a 17 bp segment similar to the CENP-B box in mammals. The element inserts into a specific site of the 155 bp repeat in a defined orientation surrounded by 2 bp direct repeats. The total number per genome of the main variant is <20. Elements can be present in all centromeres from C. pallidivittatus and the sibling species Chironomus tentans with pronounced differences in distribution within and between species.
Affiliation(s)
- C C López
- Department of Genetics, Lund University, Sölvegatan 29, S-22362 Lund, Sweden
|
33
|
Yoda K, Okazaki T. Site-specific base deletions in human alpha-satellite monomer DNAs are associated with regularly distributed CENP-B boxes. Chromosome Res 1997; 5:207-11. [PMID: 9246417 DOI: 10.1023/a:1018407316908] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 02/04/2023]
Affiliation(s)
- K Yoda
- Bioscience Center, Nagoya University, Japan.
|
34
|
Goldberg IG, Sawhney H, Pluta AF, Warburton PE, Earnshaw WC. Surprising deficiency of CENP-B binding sites in African green monkey alpha-satellite DNA: implications for CENP-B function at centromeres. Mol Cell Biol 1996; 16:5156-68. [PMID: 8756673 PMCID: PMC231516 DOI: 10.1128/mcb.16.9.5156] [Citation(s) in RCA: 52] [Impact Index Per Article: 1.8] [Indexed: 02/02/2023] Open
Abstract
Centromeres of mammalian chromosomes are rich in repetitive DNAs that are packaged into specialized nucleoprotein structures called heterochromatin. In humans, the major centromeric repetitive DNA, alpha-satellite DNA, has been extensively sequenced and shown to contain binding sites for CENP-B, an 80-kDa centromeric autoantigen. The present report reveals that African green monkey (AGM) cells, which contain extensive alpha-satellite arrays at centromeres, appear to lack the well-characterized CENP-B binding site (the CENP-B box). We show that AGM cells express a functional CENP-B homolog that binds to the CENP-B box and is recognized by several independent anti-CENP-B antibodies. However, three independent assays fail to reveal CENP-B binding sites in AGM DNA. Methods used include a gel mobility shift competition assay using purified AGM alpha-satellite, a novel kinetic electrophoretic mobility shift assay competition protocol using bulk genomic DNA, and bulk sequencing of 76 AGM alpha-satellite monomers. Immunofluorescence studies reveal the presence of significant levels of CENP-B antigen dispersed diffusely throughout the nuclei of interphase cells. These experiments reveal a paradox. CENP-B is highly conserved among mammals, yet its DNA binding site is conserved in human and mouse genomes but not in the AGM genome. One interpretation of these findings is that the role of CENP-B may be in the maintenance and/or organization of centromeric satellite DNA arrays rather than a more direct involvement in centromere structure.
Affiliation(s)
- I G Goldberg
- Department of Cell Biology and Anatomy, Johns Hopkins School of Medicine, Baltimore, Maryland 21205, USA
|
35
|
Ohki R, Oishi M, Kiyama R. Preference of the recombination sites involved in the formation of extrachromosomal copies of the human alphoid Sau3A repeat family. Nucleic Acids Res 1995; 23:4971-7. [PMID: 8559653 PMCID: PMC307501 DOI: 10.1093/nar/23.24.4971] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 01/31/2023] Open
Abstract
The human alphoid Sau3A repetitive family DNA is one of the DNA species that are actively amplified to form extrachromosomal circular DNA in several cell lines. The circularization takes place between two of the five approximately 170 bp subunits with an average of 73.1% homology as well as between identical subunits. To investigate the nature of the recombination reaction, we cloned and analyzed the subunits containing recombination junctions. Analysis of a total of 68 junctions revealed that recombination had occurred preferentially at four positions 10-25 (A), 40-50 (B), 85-90 (C) and 135-160 (D) in the 170 bp subunit structure. Two regions (B and C) overlapped with the regions with higher homology between subunits, while the other two regions (A and D) cannot be explained solely by the regional homology between the subunits. These regions were located at both junctions of the nucleosomal and the linker region, and overlapped with the binding motifs for alpha protein and CENP-B. Approximately 90% of the recombination occurred between the subunits located next but one (±2 shift), although the frequency of recombination between the adjoining subunits (±1 shift) was approximately 10%.
Affiliation(s)
- R Ohki
- Institute of Molecular and Cellular Biosciences, University of Tokyo, Japan
|
36
|
Warburton PE, Willard HF. Interhomologue sequence variation of alpha satellite DNA from human chromosome 17: evidence for concerted evolution along haplotypic lineages. J Mol Evol 1995; 41:1006-15. [PMID: 8587099 DOI: 10.1007/bf00173182] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.2] [Indexed: 01/31/2023]
Abstract
Alpha satellite DNA is a family of tandemly repeated DNA found at the centromeres of all primate chromosomes. Different human chromosomes 17 in the population are characterized by distinct alpha satellite haplotypes, distinguished by the presence of variant repeat forms that have precise monomeric deletions. Pair-wise comparisons of sequence diversity between variant repeat units from each haplotype show that they are closely related in sequence. Direct sequencing of PCR-amplified alpha satellite reveals heterogeneous positions between the repeat units on a chromosome as two bands at the same position on a sequencing ladder. No variation was detected in the sequence and location of these heterogeneous positions between chromosomes 17 from the same haplotype, but distinct patterns of variation were detected between chromosomes from different haplotypes. Subsequent sequence analysis of individual repeats from each haplotype confirmed the presence of extensive haplotype-specific sequence variation. Phylogenetic inference yielded a tree that suggests these chromosome 17 repeat units evolve principally along haplotypic lineages. These studies allow insight into the relative rates and/or timing of genetic turnover processes that lead to the homogenization of tandem DNA families.
Affiliation(s)
- P E Warburton
- Department of Genetics, Stanford University, CA 94305, USA
|
37
|
Abstract
The alpha satellite DNA of Old World (catarrhine) primates usually consists of similar, but not identical, ca. 170 bp sequences repeated tandemly hundreds to thousands of times. The 170 bp monomeric repeats are components of higher-order repeats, many of which are chromosome specific. Alpha satellites are found exclusively in centromeric regions where they appear to play a role in centromere function. We have found that alpha satellite DNA in neotropical (New World; platyrrhine) primates is very similar to its Old World counterpart: it consists of divergent ca. 170 bp subsequences that are arranged in tandem arrays with a ca. 340 bp periodicity. New and Old World alpha satellites share about 64% sequence identity overall, and contain several short sequence motifs that appear to be highly conserved. One exception to the tandemly arrayed 340 bp motif has been found: the major alpha satellite array in Chiropotes satanas (black bearded saki) has a 539 bp repeat unit that consists of a 338 bp dimer together with a duplication of 33 bp of the first monomeric unit and 168 bp of the second monomeric unit.
Affiliation(s)
- G Alves
- Genetics Section, National Cancer Institute, Rio de Janeiro, Brazil
|