1
Uno Y, Matsubara K, Inoue J, Inazawa J, Shinohara A, Koshimoto C, Ichiyanagi K, Matsuda Y. Diversity and Evolution of Highly Repetitive DNA Sequences Constituting Chromosome Site-Specific Heterochromatin in Two Gerbillinae Species. Cytogenet Genome Res 2023; 163:42-51. [PMID: 37708873 DOI: 10.1159/000533716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 08/18/2023] [Indexed: 09/16/2023]
Abstract
Constitutive heterochromatin, which consists of repetitive sequences, diverges very rapidly; its nucleotide sequences and chromosomal distributions therefore often differ substantially, even between closely related species. The chromosome C-banding patterns of two Gerbillinae species, Meriones unguiculatus and Gerbillus perpallidus, vary greatly, even though they belong to the same subfamily. To understand the evolution of C-positive heterochromatin in these species, we isolated highly repetitive sequences, determined their nucleotide sequences, and characterized them using chromosomal and filter hybridization. We obtained a centromeric repeat (MUN-HaeIII) and a chromosome 13-specific repeat (MUN-EcoRI) from M. unguiculatus. We also isolated a centromeric/pericentromeric repeat (GPE-MBD) and an interspersed-type repeat that was predominantly amplified in the X and Y chromosomes (GPE-EcoRI) from G. perpallidus. GPE-MBD was found to contain a 17-bp motif that is essential for binding to the centromere-associated protein CENP-B, which indicates that it may play a role in the formation of a specific centromere structure and/or function. The nucleotide sequences of the three sequence families other than GPE-EcoRI were conserved only in Gerbillinae. GPE-EcoRI was derived from the long interspersed nuclear element 1 retrotransposon and showed sequence homology throughout Muridae and Cricetidae species, indicating that the repeat sequence arose at least as early as the common ancestor of Muridae and Cricetidae. Because the whole-genome sequences of vertebrate species published to date lack assembly data for the highly repetitive sequences that constitute heterochromatin, the knowledge obtained in this study provides useful information for a deeper understanding of the evolution of repetitive sequences, not only in rodents but also in mammals more broadly.
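The abstract reports a 17-bp CENP-B binding motif in GPE-MBD but does not spell the motif out. As a rough illustration of how such a motif could be located in a sequenced repeat, the sketch below scans both strands for the commonly cited CENP-B box consensus `NTTCGNNNNANNCGGGN` (N = any base). The consensus string, the function names, and the toy sequence are assumptions made for illustration; they are not taken from the article, and this is not the authors' procedure.

```python
import re

# Assumption: commonly cited 17-bp CENP-B box consensus (N = any base).
# The article itself does not list the motif in its abstract.
CENPB_BOX = "NTTCGNNNNANNCGGGN"


def consensus_to_regex(consensus):
    """Turn a consensus with N wildcards into a regex pattern string."""
    return "".join("[ACGT]" if base == "N" else base for base in consensus)


def reverse_complement(seq):
    """Reverse-complement a DNA string; N (and other symbols) are left as-is."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cenpb_boxes(sequence):
    """Return sorted (start, strand) pairs, 0-based on the given strand.

    '+' means the consensus matches the sequence as written;
    '-' means its reverse complement matches at that position.
    """
    sequence = sequence.upper()
    hits = []
    patterns = {
        "+": consensus_to_regex(CENPB_BOX),
        "-": consensus_to_regex(reverse_complement(CENPB_BOX)),
    }
    for strand, pattern in patterns.items():
        # Lookahead so that overlapping matches are also reported.
        for match in re.finditer("(?=(" + pattern + "))", sequence):
            hits.append((match.start(), strand))
    return sorted(hits)


# Hypothetical toy sequence: one box flanked by filler bases.
toy = "AAAA" + "CTTCGTTGGAAACGGGA" + "TTTT"
print(find_cenpb_boxes(toy))
print(find_cenpb_boxes(reverse_complement(toy)))
```

Scanning both strands matters because a cloned repeat unit may be sequenced in either orientation relative to the motif.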
Affiliation(s)
- Yoshinobu Uno
  - Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- Kazumi Matsubara
  - Department of Environmental Biology, College of Bioscience and Biotechnology, Chubu University, Kasugai, Japan
- Jun Inoue
  - Department of Molecular Cytogenetics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
  - Bioresource Research Center, Tokyo Medical and Dental University, Tokyo, Japan
- Johji Inazawa
  - Department of Molecular Cytogenetics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
  - Bioresource Research Center, Tokyo Medical and Dental University, Tokyo, Japan
- Akio Shinohara
  - Department of Biotechnology, Frontier Science Research Center, University of Miyazaki, Miyazaki, Japan
- Chihiro Koshimoto
  - Department of Biotechnology, Frontier Science Research Center, University of Miyazaki, Miyazaki, Japan
- Kenji Ichiyanagi
  - Department of Animal Sciences, Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya, Japan
- Yoichi Matsuda
  - Department of Animal Sciences, Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya, Japan
2
Paar V, Basar I, Rosandić M, Glunčić M. Consensus higher order repeats and frequency of string distributions in human genome. Curr Genomics 2007; 8:93-111. [PMID: 18660848 PMCID: PMC2435359 DOI: 10.2174/138920207780368169] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 01/17/2007] [Revised: 01/26/2007] [Accepted: 01/30/2007] [Indexed: 02/01/2023]
Abstract
The key string algorithm (KSA) can be viewed as a robust computational generalization of the restriction enzyme method. KSA enables robust and effective identification and structural analysis of any given genomic sequence, such as the NCBI assembly of the human genome. We have developed a method that uses the total frequency distribution of all r-bp key strings as a function of the fragment length l to determine the exact size of all repeats within a given genomic sequence, of both monomeric and higher order repeat (HOR) type. Subsequently, for fragment lengths equal to each of these repeat sizes, we compute the partial frequency distribution of r-bp key strings; the key string with the highest frequency is the dominant key string, which is optimal for segmenting the genomic sequence into repeat units. We illustrate how a wide class of 3-bp key strings leads to a key-string-dependent periodic cell that enables simple identification and consensus length determination of HORs, or of any other highly convergent repeat of monomeric or HOR type, whether tandem or dispersed. We illustrate the application of KSA to HORs in the human genome and determine consensus HORs in the Build 35.1 assembly. In the next step we compute the suprachromosomal family classification and the CENP-B box / pJalpha distributions for HORs. For less convergent repeats, such as monomeric alpha satellites (20-40% divergence), we searched for an optimal compact key string using the frequency method and developed the concept of a composite key string (GAAAC--CTTTG) or flexible relaxation (28-bp key string), which yields both monomeric alpha satellites and the alpha monomer segmentation of internal HOR structure. This method is also convenient for studying alternations of R-strand (direct) and S-strand (reverse complement) alpha monomers. Using KSA we identified 16 alternating regions of R-strand and S-strand monomers in one contig in chromosome 7.
Using the CENP-B box and/or the pJalpha motif as the key string is suitable both for identifying HORs and monomeric patterns and for studying CENP-B box / pJalpha distributions. As an example of applying KSA to sequences outside HOR regions, we present our finding of a tandem repeat with a highly convergent 3434-bp long monomer in chromosome 5 (divergence less than 0.3%).
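The core idea described above, cutting a sequence at every occurrence of a short key string and inspecting the resulting fragment lengths, can be sketched in a few lines. The code below is a simplified toy under stated assumptions, not the authors' KSA implementation: the function names and the toy sequence are hypothetical, and the "dominant key string" selection here simply counts fragments of one given length rather than reproducing the paper's total and partial frequency distributions.

```python
from collections import Counter


def key_string_fragments(sequence, key):
    """Cut `sequence` at each occurrence of `key`; return the fragment lengths.

    A fragment runs from the start of one key-string occurrence to the
    start of the next, mimicking a restriction digest in silico.
    """
    positions = []
    start = sequence.find(key)
    while start != -1:
        positions.append(start)
        start = sequence.find(key, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]


def dominant_key_string(sequence, r, period):
    """Among all r-bp key strings present, return (key, count) for the key
    that yields the most fragments of exactly `period` bp."""
    keys = {sequence[i:i + r] for i in range(len(sequence) - r + 1)}
    counts = Counter(
        {key: key_string_fragments(sequence, key).count(period) for key in keys}
    )
    return counts.most_common(1)[0]


# Hypothetical toy data: a perfect 12-bp monomer repeated five times.
toy = "ACGTTGCAAGCT" * 5
lengths = key_string_fragments(toy, "ACG")
print(Counter(lengths))            # length distribution for key string "ACG"
print(dominant_key_string(toy, 3, 12))
```

In a perfect tandem repeat many key strings tie, so which key is reported as dominant is arbitrary; in real, diverged repeats the fragment-length counts differ between key strings, which is what makes the choice informative.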
Affiliation(s)
- Vladimir Paar
  - Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia
- Ivan Basar
  - Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia
- Marija Rosandić
  - Department of Internal Medicine, University Hospital Rebro, Kišpatićeva 12, 10000 Zagreb, Croatia
- Matko Glunčić
  - Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia