1
Tandem Repeats in Bacillus: Unique Features and Taxonomic Distribution. Int J Mol Sci 2021; 22:ijms22105373. [PMID: 34065296 PMCID: PMC8161180 DOI: 10.3390/ijms22105373] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/20/2021] [Revised: 05/14/2021] [Accepted: 05/18/2021] [Indexed: 11/16/2022] Open
Abstract
Little is known about DNA tandem repeats across prokaryotes. We have recently described an enigmatic group of tandem repeats in bacterial genomes with a constant repeat size but variable sequence. These findings strongly suggest that tandem repeat size in some bacteria is under strong selective constraints. Here, we extend these studies and describe tandem repeats in a large set of Bacillus. Some species have very few repeats, while other species have a large number. Most tandem repeats have a constant repeat size (either 52 or 20-21 nt) but a variable sequence. We characterize these intriguing tandem repeats in detail. Individual species have several families of tandem repeats with the same repeat length and different sequences. This result is in strong contrast with eukaryotes, where tandem repeats of many sizes are found in any species. We discuss the possibility that they are transcribed as small RNA molecules. They may also be involved in the stabilization of the nucleoid through interaction with proteins. We also show that the distribution of tandem repeats in different species has taxonomic significance. The data we present for all tandem repeats and their families in these bacterial species will be useful for further genomic studies.
2
Unique Features of Tandem Repeats in Bacteria. J Bacteriol 2020; 202:JB.00229-20. [PMID: 32839174 DOI: 10.1128/jb.00229-20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/20/2020] [Accepted: 08/17/2020] [Indexed: 02/06/2023] Open
Abstract
DNA tandem repeats, or satellites, are well described in eukaryotic species, but little is known about their prevalence across prokaryotes. Here, we performed the most complete characterization to date of satellites in bacteria. We identified 121,638 satellites from 12,233 fully sequenced and assembled bacterial genomes with a very uneven distribution. We also determined the families of satellites which have a related sequence. There are 85 genomes that are particularly satellite rich and contain several families of satellites of yet unknown function. Interestingly, we only found two main types of noncoding satellites, depending on their repeat sizes, 22/44 or 52 nucleotides (nt). An intriguing feature is the constant size of the repeats in the genomes of different species, whereas their sequences show no conservation. Individual species also have several families of satellites with the same repeat length and different sequences. This result is in marked contrast with previous findings in eukaryotes, where noncoding satellites of many sizes are found in any species investigated. We describe in greater detail these noncoding satellites in the spirochete Leptospira interrogans and in several bacilli. These satellites undoubtedly play a specific role in the species which have acquired them. We discuss the possibility that they represent binding sites for transcription factors not previously described or that they are involved in the stabilization of the nucleoid through interaction with proteins.
IMPORTANCE: We found an enigmatic group of noncoding satellites in 85 bacterial genomes with a constant repeat size but variable sequence. This pattern of DNA organization is unique and had not been previously described in bacteria. These findings strongly suggest that satellite size in some bacteria is under strong selective constraints and thus that satellites are very likely to play a fundamental role. We also provide a list and properties of all satellites in 12,233 genomes, which may be used for further genomic analysis.
3
Subirana JA, Albà MM, Messeguer X. High evolutionary turnover of satellite families in Caenorhabditis. BMC Evol Biol 2015; 15:218. [PMID: 26438045 PMCID: PMC4595182 DOI: 10.1186/s12862-015-0495-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 08/03/2015] [Accepted: 09/22/2015] [Indexed: 02/07/2023] Open
Abstract
Background: The high density of tandem repeat sequences (satellites) in nematode genomes and the availability of genome sequences from several species in the group offer a unique opportunity to better understand the evolutionary dynamics and the functional role of these sequences. We take advantage of the previously developed SATFIND program to study the satellites in four Caenorhabditis species and investigate these questions.
Methods: The identification and comparison of satellites is carried out in three steps. First we find all the satellites present in each species with the SATFIND program. Each satellite is defined by its length, number of repeats, and repeat sequence. Only satellites with at least ten repeats are considered. In the second step we build satellite families with a newly developed alignment program. Satellite families are defined by a consensus sequence and the number of satellites in the family. Finally we compare the consensus sequence of satellite families in different species.
Results: We give a catalog of individual satellites in each species. We have also identified satellite families with a related sequence and compare them in different species. We analyze the turnover of satellites: they increased in size through duplications of fragments of 100-300 bases. It appears that in many cases they have undergone an explosive expansion. In C. elegans we have identified a subset of large satellites that have strong affinity for the centromere protein CENP-A. We have also compared our results with those obtained from other species, including one nematode and three mammals.
Conclusions: Most satellite families found in Caenorhabditis are species-specific, in particular those with long repeats. A subset of these satellites may facilitate the formation of kinetochores in mitosis. Other satellite families in C. elegans are either related to Helitron transposons or to meiotic pairing centers.
Electronic supplementary material: The online version of this article (doi:10.1186/s12862-015-0495-x) contains supplementary material, which is available to authorized users.
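The first step described in the Methods (a satellite defined by its repeat length, number of repeats, and repeat sequence, with at least ten repeats required) can be illustrated with a minimal sketch. This is not SATFIND, whose algorithm is not described in the abstract. The function name `find_exact_tandem_repeats` and its parameters are hypothetical, and the sketch assumes exact, error-free copies of a repeat unit of known length, which is stricter than a real satellite finder would be.

```python
def find_exact_tandem_repeats(seq, unit_len, min_copies=10):
    """Return (start, copies, unit) for exact tandem arrays of a fixed unit length.

    Illustrative only: real satellites tolerate mismatches and indels,
    which this exact-match scan does not handle.
    """
    if unit_len <= 0:
        raise ValueError("unit_len must be positive")
    hits = []
    i = 0
    n = len(seq)
    while i + unit_len <= n:
        unit = seq[i:i + unit_len]
        copies = 1
        j = i + unit_len
        # Extend the array while the next window repeats the unit exactly.
        while seq[j:j + unit_len] == unit:
            copies += 1
            j += unit_len
        if copies >= min_copies:
            hits.append((i, copies, unit))
            i = j          # skip past the array just reported
        else:
            i += 1         # slide one base and try again
    return hits


# Toy sequence: 12 exact copies of a 4-nt unit flanked by unrelated bases.
toy = "GG" + "ACGT" * 12 + "TT"
print(find_exact_tandem_repeats(toy, unit_len=4))  # -> [(2, 12, 'ACGT')]
```

The `min_copies=10` default mirrors the "at least ten repeats" threshold stated in the abstract; building families from consensus sequences would be a separate alignment step not sketched here.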
Affiliation(s)
- Juan A Subirana
  - Department of Computer Science, Universitat Politècnica de Catalunya, Jordi Girona 31, Barcelona, 08034, Spain.
  - Evolutionary Genomics Group, Research Programme on Biomedical Informatics (GRIB) - Hospital del Mar Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Dr. Aiguader 86, Barcelona, 08003, Spain.
- M Mar Albà
  - Evolutionary Genomics Group, Research Programme on Biomedical Informatics (GRIB) - Hospital del Mar Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Dr. Aiguader 86, Barcelona, 08003, Spain.
- Xavier Messeguer
  - Department of Computer Science, Universitat Politècnica de Catalunya, Jordi Girona 31, Barcelona, 08034, Spain.
4
Tiirikka T, Siermala M, Vihinen M. Clustering of gene ontology terms in genomes. Gene 2014; 550:155-64. [PMID: 24995610 DOI: 10.1016/j.gene.2014.06.060] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/09/2012] [Revised: 06/26/2014] [Accepted: 06/27/2014] [Indexed: 01/08/2023]
Abstract
Although protein coding genes occupy only a small fraction of genomes in higher species, they are not randomly distributed within or between chromosomes. Clustering of genes with related function(s) and/or characteristics has been evident at several different levels. To study how common the clustering of functionally related genes is and what kinds of functions the end products of these genes are involved in, we collected gene ontology (GO) terms for complete genomes and developed a method to detect previously undefined gene clustering. Exhaustive analysis was performed for seven widely studied species ranging from human to Escherichia coli. To overcome problems related to varying gene lengths and densities, a novel method was developed and a fixed number of genes was analyzed irrespective of the genome span covered. Statistically very significant GO term clustering was apparent in all the investigated genomes. The analysis window, which ranged from 5 to 50 consecutive genes, revealed extensive GO term clusters for genes with widely varying functions. Here, the most interesting and significant results are discussed and the complete dataset for each analyzed species is available at the GOme database at http://bioinf.uta.fi/GOme. The results indicated that clusters of genes with related functions are very common, not only in bacteria, in which operons are frequent, but also in all the studied species irrespective of how complex they are. There are some differences between species but in all of them GO term clusters are common and of widely differing sizes. The presented method can be applied to analyze any genome or part of a genome for which descriptive features are available, and thus is not restricted to ontology terms. This method can also be applied to investigate gene and protein expression patterns. The results pave the way for further studies of mechanisms that shape genome structure and evolutionary forces related to them.
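The idea of analyzing a fixed number of consecutive genes, rather than a fixed genomic span, can be sketched as a sliding window over genes in chromosomal order. This is only an illustration under stated assumptions: the function name `term_counts_per_window` and the input layout (one set of GO term identifiers per gene) are hypothetical, and the statistical test the authors used to call a cluster significant is not described in the abstract and is not reproduced here.

```python
def term_counts_per_window(gene_terms, term, window):
    """Count genes annotated with `term` in every run of `window` consecutive genes.

    gene_terms: list of sets of GO term IDs, one set per gene, in genome order.
    Returns one count per window start position.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    flags = [1 if term in terms else 0 for terms in gene_terms]
    if window > len(flags):
        return []
    count = sum(flags[:window])
    counts = [count]
    # Slide by one gene: add the gene entering the window, drop the one leaving.
    for k in range(window, len(flags)):
        count += flags[k] - flags[k - window]
        counts.append(count)
    return counts


# Five genes in order; the GO IDs are placeholders, not real annotations.
genes = [{"GO:1"}, {"GO:1"}, {"GO:2"}, set(), {"GO:1", "GO:2"}]
print(term_counts_per_window(genes, "GO:1", window=3))  # -> [2, 1, 1]
```

Because the window is measured in genes, regions with long or sparse genes are treated the same as dense ones, which is the motivation the abstract gives for the approach. The abstract reports window sizes from 5 to 50 genes.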
Affiliation(s)
- Timo Tiirikka
  - Institute of Biomedical Technology, University of Tampere, Finland and BioMediTech, FI-33014 Tampere, Finland; Medical Research Center Oulu, Oulu University Hospital, University of Oulu, Finland.
- Markku Siermala
  - Institute of Biomedical Technology, University of Tampere, Finland and BioMediTech, FI-33014 Tampere, Finland.
- Mauno Vihinen
  - Institute of Biomedical Technology, University of Tampere, Finland and BioMediTech, FI-33014 Tampere, Finland; Department of Experimental Medical Science, Lund University, SE-22 184 Lund, Sweden.
5
Toll-Riera M, Albà MM. Emergence of novel domains in proteins. BMC Evol Biol 2013; 13:47. [PMID: 23425224 PMCID: PMC3599535 DOI: 10.1186/1471-2148-13-47] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 11/28/2012] [Accepted: 01/31/2013] [Indexed: 12/31/2022] Open
Abstract
Background: Proteins are composed of a combination of discrete, well-defined, sequence domains, associated with specific functions that have arisen at different times during evolutionary history. The emergence of novel domains is related to protein functional diversification and adaptation. But currently little is known about how novel domains arise and how they subsequently evolve.
Results: To gain insights into the impact of recently emerged domains in protein evolution we have identified all human young protein domains that have emerged in approximately the past 550 million years. We have classified them into vertebrate-specific and mammalian-specific groups, and compared them to older domains. We have found 426 different annotated young domains, totalling 995 domain occurrences, which represent about 12.3% of all human domains. We have observed that 61.3% of them arose in newly formed genes, while the remaining 38.7% are found combined with older domains, and have very likely emerged in the context of a previously existing protein. Young domains are preferentially located at the N-terminus of the protein, indicating that, at least in vertebrates, novel functional sequences often emerge there. Furthermore, young domains show significantly higher non-synonymous to synonymous substitution rates than older domains using human and mouse orthologous sequence comparisons. This is also true when we compare young and old domains located in the same protein, suggesting that recently arisen domains tend to evolve in a less constrained manner than older domains.
Conclusions: We conclude that proteins tend to gain domains over time, becoming progressively longer. We show that many proteins are made of domains of different age, and that the fastest evolving parts correspond to the domains that have been acquired more recently.
Affiliation(s)
- Macarena Toll-Riera
  - Evolutionary Genomics Group, Research Programme on Biomedical Informatics (GRIB) - Hospital del Mar Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Barcelona, Spain
6
Seplyarskiy VB, Kharchenko P, Kondrashov AS, Bazykin GA. Heterogeneity of the transition/transversion ratio in Drosophila and Hominidae genomes. Mol Biol Evol 2012; 29:1943-55. [PMID: 22337862 DOI: 10.1093/molbev/mss071] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 12/26/2022] Open
Abstract
Mutation rate varies between sites in the genome. Part of this variation can be explained by well-recognized short nucleotide contexts, but a large component of this variation remains cryptic. We used data on interspecies divergence and intraspecies polymorphism in Drosophila and Hominidae to analyze variation of the average rate of the 12 possible kinds of single-nucleotide mutations and in the transition/transversion ratio κ at single-nucleotide resolution. Both the average mutation rate and κ vary by a factor of ~3 between nucleotide sites. The characteristic scale of variation in κ is up to at least ~30 nucleotides in Drosophila and ~5 nucleotides in Hominidae. Genome segments with locally elevated mutation rates possess lower values of κ; however, a substantial fraction of variation in κ cannot be directly explained by the local mutation rates.
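As background for the ratio κ discussed above: of the 12 possible single-nucleotide changes, 4 are transitions (purine to purine or pyrimidine to pyrimidine) and 8 are transversions. The sketch below simply counts the two classes between a pair of aligned sequences and takes their ratio. It is an assumption-laden illustration, not the estimation procedure of the paper, which works from divergence and polymorphism data at single-nucleotide resolution. The function names `ts_tv_counts` and `ts_tv_ratio` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences.

    Positions with gaps or ambiguous bases (anything other than A, C, G, T)
    are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1   # A<->G or C<->T
        else:
            tv += 1   # purine <-> pyrimidine
    return ts, tv


def ts_tv_ratio(seq_a, seq_b):
    """Raw count ratio of transitions to transversions; None if no transversions."""
    ts, tv = ts_tv_counts(seq_a, seq_b)
    return ts / tv if tv else None


print(ts_tv_counts("ACGTAC", "GCATCC"))  # -> (2, 1)
print(ts_tv_ratio("ACGTAC", "GCATCC"))   # -> 2.0
```

A raw count ratio like this ignores multiple hits and base composition, so it should be read only as a way to see what is being compared when κ is said to vary between sites.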
Affiliation(s)
- Vladimir B Seplyarskiy
  - Department of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia.
7
Al-Shahrour F, Minguez P, Marqués-Bonet T, Gazave E, Navarro A, Dopazo J. Selection upon genome architecture: conservation of functional neighborhoods with changing genes. PLoS Comput Biol 2010; 6:e1000953. [PMID: 20949098 PMCID: PMC2951340 DOI: 10.1371/journal.pcbi.1000953] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Received: 12/06/2009] [Accepted: 09/08/2010] [Indexed: 11/19/2022] Open
Abstract
A growing body of evidence shows that genes are not distributed randomly across eukaryotic chromosomes, but rather in functional neighborhoods. Nevertheless, the driving force that originated and maintains such neighborhoods is still a matter of controversy. We present the first detailed multispecies cartography of genome regions enriched in genes with related functions and study the evolutionary implications of such clustering. Our results indicate that the chromosomes of higher eukaryotic genomes contain up to 12% of genes arranged in functional neighborhoods, with a high level of gene co-expression, which are consistently distributed in phylogenies. Unexpectedly, neighborhoods with homologous functions are formed by different (non-orthologous) genes in different species. In fact, instead of being conserved, functional neighborhoods present a higher degree of synteny breaks than the genome average. This scenario is compatible with the existence of selective pressures optimizing the coordinated transcription of blocks of functionally related genes. If these neighborhoods were broken by chromosomal rearrangements, selection would favor further rearrangements reconstructing other neighborhoods of similar function. The picture arising from this study is a dynamic genomic landscape with a high level of functional organization.
We describe here the most extensive functional cartography of the genomes of multiple species carried out to date. Our study shows, for the first time, how neighborhoods of functionally related genes arise and how they are maintained through evolution following a pattern that is fully consistent with the evolutionary trees of the analyzed species. Contrary to what would be expected, such neighborhoods are not composed of the same genes in different species but rather of unrelated genes that are nevertheless annotated with the same function. Our analysis also reveals that such neighborhoods are dynamically rebuilt in such a way that, while the particular genes often change, it is the function of the genes present in the neighborhood, as the ultimate target of selection, that is preserved.
Affiliation(s)
- Fátima Al-Shahrour
  - Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- Pablo Minguez
  - Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- Tomás Marqués-Bonet
  - Institut de Biologia Evolutiva, Universitat Pompeu Fabra (UPF) and Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
  - Howard Hughes Medical Institute, University of Washington, Seattle, Washington, United States of America
- Elodie Gazave
  - Institut de Biologia Evolutiva, Universitat Pompeu Fabra (UPF) and Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain
- Arcadi Navarro
  - Institut de Biologia Evolutiva, Universitat Pompeu Fabra (UPF) and Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain
  - Population Genomics Node (National Institute for Bioinformatics, INB), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Joaquín Dopazo
  - Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
  - CIBER de Enfermedades Raras (CIBERER), Valencia, Spain
  - Functional Genomics Node (National Institute for Bioinformatics, INB), CIPF, Valencia, Spain
8
Farré D, Albà MM. Heterogeneous patterns of gene-expression diversification in mammalian gene duplicates. Mol Biol Evol 2009; 27:325-35. [PMID: 19822635 DOI: 10.1093/molbev/msp242] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 12/29/2022] Open
Abstract
Gene duplication is a major mechanism for molecular evolutionary innovation. Young gene duplicates typically exhibit elevated rates of protein evolution and, according to a number of recent studies, increased expression divergence. However, the nature of these changes is still poorly understood. To gain novel insights into the functional consequences of gene duplication, we have undertaken an in-depth analysis of a large data set of gene families containing primate- and/or rodent-specific gene duplicates. We have found a clear tendency toward an increase in protein, promoter, and expression divergence with increasing number of duplication events undergone by each gene since the human-mouse split. In addition, gene duplication is significantly associated with a reduction in expression breadth and intensity. Interestingly, it is possible to identify three main groups regarding the evolution of gene expression following gene duplication. The first group, which comprises around 25% of the families, shows patterns compatible with tissue-expression partitioning. The second and largest group, comprising 33-53% of the families, shows broad expression of one of the gene copies and reduced, overlapping, expression of the other copy or copies. This can be attributed, in most cases, to loss of expression in several tissues of one or more gene copies. Finally, a substantial number of families, 19-35%, maintain a very high level of tissue-expression overlap (>0.8) after tens of millions of years of evolution. These families may have been subject to selection for increased gene dosage.
9
Toll-Riera M, Bosch N, Bellora N, Castelo R, Armengol L, Estivill X, Albà MM. Origin of primate orphan genes: a comparative genomics approach. Mol Biol Evol 2008; 26:603-12. [PMID: 19064677 DOI: 10.1093/molbev/msn281] [Citation(s) in RCA: 182] [Impact Index Per Article: 10.7] [Indexed: 12/18/2022] Open
Abstract
Genomes contain a large number of genes that do not have recognizable homologues in other species and that are likely to be involved in important species-specific adaptive processes. The origin of many such "orphan" genes remains unknown. Here we present the first systematic study of the characteristics and mechanisms of formation of primate-specific orphan genes. We determine that codon usage values for most orphan genes fall within the bulk of the codon usage distribution of bona fide human proteins, supporting their current protein-coding annotation. We also show that primate orphan genes display distinctive features in relation to genes of wider phylogenetic distribution: higher tissue specificity, more rapid evolution, and shorter peptide size. We estimate that around 24% are highly divergent members of mammalian protein families. Interestingly, around 53% of the orphan genes contain sequences derived from transposable elements (TEs) and are mostly located in primate-specific genomic regions. This indicates frequent recruitment of TEs as part of novel genes. Finally, we also obtain evidence that a small fraction of primate orphan genes, around 5.5%, might have originated de novo from mammalian noncoding genomic regions.
Affiliation(s)
- Macarena Toll-Riera
  - Evolutionary Genomics Group, Biomedical Informatics Research Programme, Fundació Institut Municipal d'Investigació Mèdica, Barcelona, Spain
10
Moyrand F, Lafontaine I, Fontaine T, Janbon G. UGE1 and UGE2 regulate the UDP-glucose/UDP-galactose equilibrium in Cryptococcus neoformans. Eukaryotic Cell 2008; 7:2069-77. [PMID: 18820075 PMCID: PMC2593187 DOI: 10.1128/ec.00189-08] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 06/12/2008] [Accepted: 09/19/2008] [Indexed: 11/20/2022]
Abstract
The genome of the basidiomycete pathogenic yeast Cryptococcus neoformans carries two UDP-glucose epimerase genes (UGE1 and UGE2). UGE2 maps within a galactose cluster composed of a galactokinase homologue gene and a galactose-1-phosphate uridylyltransferase. This clustered organization of the GAL genes is similar to that in most of the hemiascomycete yeast genomes and in Schizosaccharomyces pombe but is otherwise not generally conserved in the fungal kingdom. UGE1 has been identified as necessary for galactoxylomannan biosynthesis and virulence. Here, we show that UGE2 is necessary for C. neoformans cells to utilize galactose as a carbon source at 30 °C but is not required for virulence. In contrast, deletion of UGE1 does not affect cell growth on galactose at this temperature. At 37 °C, a uge2Δ mutant grows on galactose in a UGE1-dependent manner. This compensation by UGE1 of UGE2 mutation for growth on galactose at 37 °C was not associated with upregulation of UGE1 transcription or with an increase of the affinity of the enzyme for UDP-galactose at this temperature. We studied the subcellular localization of the two enzymes. Whereas at 30 °C, Uge1p is at least partially associated with intracellular vesicles and Uge2p is on the plasma membrane, in cells growing on galactose at 37 °C, Uge1p colocalizes with Uge2p to the plasma membrane, suggesting that its activity is regulated through subcellular localization.
Affiliation(s)
- Frédérique Moyrand
  - Unité de Mycologie Moléculaire, Institut Pasteur, CNRS, URA3012, 75724 Paris Cedex 15, France
11
Marques-Bonet T, Cheng Z, She X, Eichler EE, Navarro A. The genomic distribution of intraspecific and interspecific sequence divergence of human segmental duplications relative to human/chimpanzee chromosomal rearrangements. BMC Genomics 2008; 9:384. [PMID: 18699995 PMCID: PMC2542386 DOI: 10.1186/1471-2164-9-384] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/06/2007] [Accepted: 08/12/2008] [Indexed: 11/20/2022] Open
Abstract
Background: It has been suggested that chromosomal rearrangements harbor the molecular footprint of the biological phenomena which they induce, in the form, for instance, of changes in the sequence divergence rates of linked genes. So far, all the studies of these potential associations have focused on the relationship between structural changes and the rates of evolution of single-copy DNA and have tried to exclude segmental duplications (SDs). This is paradoxical, since SDs are one of the primary forces driving the evolution of structure and function in our genomes and have been linked not only with novel genes acquiring new functions, but also with overall higher DNA sequence divergence and major chromosomal rearrangements.
Results: Here we take the opposite view and focus on SDs. We analyze several of the features of SDs, including the rates of intraspecific divergence between paralogous copies of human SDs and of interspecific divergence between human SDs and chimpanzee DNA. We study how divergence measures relate to chromosomal rearrangements, while considering other factors that affect evolutionary rates in single copy DNA.
Conclusion: We find that interspecific SD divergence behaves similarly to divergence of single-copy DNA. In contrast, old and recent paralogous copies of SDs do present different patterns of intraspecific divergence. Also, we show that some relatively recent SDs accumulate in regions that carry inversions in sister lineages.
Affiliation(s)
- Tomàs Marques-Bonet
  - Unitat de Biologia Evolutiva Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Ze Cheng
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Xinwei She
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Arcadi Navarro
  - Unitat de Biologia Evolutiva Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
  - Institucio Catalana de Recerca i Estudis Avancats (ICREA) and Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
  - Population Genomics Node (GNV8), National Institute for Bioinformatics (INB) Universitat Pompeu Fabra, Spain
12
Ruiz-Herrera A, Castresana J, Robinson TJ. Is mammalian chromosomal evolution driven by regions of genome fragility? Genome Biol 2006; 7:R115. [PMID: 17156441 PMCID: PMC1794428 DOI: 10.1186/gb-2006-7-12-r115] [Citation(s) in RCA: 110] [Impact Index Per Article: 5.8] [Received: 08/01/2006] [Revised: 11/06/2006] [Accepted: 12/08/2006] [Indexed: 11/22/2022] Open
Abstract
Background: A fundamental question in comparative genomics concerns the identification of mechanisms that underpin chromosomal change. In an attempt to shed light on the dynamics of mammalian genome evolution, we analyzed the distribution of syntenic blocks, evolutionary breakpoint regions, and evolutionary breakpoints taken from public databases available for seven eutherian species (mouse, rat, cattle, dog, pig, cat, and horse) and the chicken, and examined these for correspondence with human fragile sites and tandem repeats.
Results: Our results confirm previous investigations that showed the presence of chromosomal regions in the human genome that have been repeatedly used as illustrated by a high breakpoint accumulation in certain chromosomes and chromosomal bands. We show, however, that there is a striking correspondence between fragile site location, the positions of evolutionary breakpoints, and the distribution of tandem repeats throughout the human genome, which similarly reflect a non-uniform pattern of occurrence.
Conclusion: These observations provide further evidence that certain chromosomal regions in the human genome have been repeatedly used in the evolutionary process. As a consequence, the genome is a composite of fragile regions prone to reorganization that have been conserved in different lineages, and genomic tracts that do not exhibit the same levels of evolutionary plasticity.
Affiliation(s)
- Aurora Ruiz-Herrera
  - Evolutionary Genomics Group, Department of Botany & Zoology, University of Stellenbosch, Private Bag X1, Matieland 7602, South Africa
- Jose Castresana
  - Institut de Biologia Molecular de Barcelona, CSIC, Department of Physiology and Molecular Biodiversity, Jordi Girona 18, 08034 Barcelona, Spain
- Terence J Robinson
  - Evolutionary Genomics Group, Department of Botany & Zoology, University of Stellenbosch, Private Bag X1, Matieland 7602, South Africa
13
Marques-Bonet T, Navarro A. Chromosomal rearrangements are associated with higher rates of molecular evolution in mammals. Gene 2005; 353:147-54. [PMID: 15951139 DOI: 10.1016/j.gene.2005.05.007] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Received: 02/10/2005] [Revised: 04/25/2005] [Accepted: 05/10/2005] [Indexed: 10/25/2022]
Abstract
Evolutionary rates are not uniformly distributed across the genome. Knowledge about the biological causes of this observation is still incomplete, but its exploration has provided valuable insight into the genomic, historical and demographic variables that influence rates of genetic divergence. Recent studies suggest a possible association between chromosomal rearrangements and regions of greater divergence, but evidence is limited and contradictory. Here, we test the hypothesis of a relationship between chromosomal rearrangements and higher rates of molecular evolution by studying the genomic distribution of divergence between 12,000 human-mouse orthologous genes. Our results clearly show that genes located in genomic regions that have been highly rearranged between the two species present higher rates of synonymous (0.7686 vs. 0.7076) and non-synonymous substitution (0.1014 vs. 0.0871), and that synonymous substitution rates are higher in genes close to the breakpoints of individual rearrangements. The many potential causes of such striking differences are discussed, particularly in the light of speciation models suggesting that chromosomal rearrangements may have contributed to some of the speciation processes along the human and mouse lineages. Still, there are other possible causes and further research is needed to properly explore them.
Affiliation(s)
- Tomàs Marques-Bonet
  - Unitat de Biologia Evolutiva Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Doctor Aiguader 80, 08003 Barcelona, Spain