1
Potapova NA, Kondrashov AS, Mirkin SM. Characteristics and possible mechanisms of formation of microinversions distinguishing human and chimpanzee genomes. Sci Rep 2022; 12:591. [PMID: 35022450 PMCID: PMC8755829 DOI: 10.1038/s41598-021-04621-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2021] [Accepted: 12/28/2021] [Indexed: 12/02/2022] Open
Abstract
Genomic inversions come in various sizes. While long inversions are relatively easy to identify by aligning high-quality genome sequences, unambiguous identification of microinversions is more problematic. Here, using a set of extra stringent criteria to distinguish microinversions from other mutational events, we describe microinversions that occurred after the divergence of humans and chimpanzees. In total, we found 59 definite microinversions that range from 17 to 33 nucleotides in length. In the majority of them, human genome sequences matched exactly the reverse-complemented chimpanzee genome sequences, implying that the inverted DNA segment was copied precisely. All these microinversions were flanked by perfect or nearly perfect inverted repeats, pointing to a key role of these repeats in microinversion formation. Template switching at inverted repeats during DNA replication was previously discussed as a possible mechanism for microinversion formation. However, many of the definite microinversions we found cannot be easily explained via template switching owing to the combination of the short length and imperfect nature of their flanking inverted repeats. We propose a novel, alternative mechanism that involves repair of a double-stranded break within the inverting segment via microhomology-mediated break-induced replication, which can consistently explain all definite microinversion events.
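The exact-match criterion described in this abstract (a human segment equal to the reverse complement of the aligned chimpanzee segment, flanked by inverted repeats) can be illustrated with a minimal Python sketch. The function names, the toy sequences, and the requirement that the sequence outside the segment be identical are assumptions made for illustration; this is not the authors' pipeline, and it does not reproduce their stringency criteria.

```python
# Illustrative sketch only: names and toy sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_exact_microinversion(human: str, chimp: str, start: int, end: int) -> bool:
    """Check whether human[start:end] is the exact reverse complement of
    chimp[start:end] while everything outside the segment is identical.
    Assumes the two sequences are already aligned without gaps."""
    if len(human) != len(chimp):
        return False
    same_outside = human[:start] == chimp[:start] and human[end:] == chimp[end:]
    inverted = human[start:end] == reverse_complement(chimp[start:end])
    changed = human[start:end] != chimp[start:end]  # exclude self-complementary segments
    return same_outside and inverted and changed


def flanked_by_inverted_repeat(seq: str, start: int, end: int, k: int) -> bool:
    """Check whether the k bases left of the segment are the reverse
    complement of the k bases right of it (a perfect inverted repeat)."""
    if start - k < 0 or end + k > len(seq):
        return False
    return reverse_complement(seq[start - k:start]) == seq[end:end + k]


# Toy example: an 11-nt segment inverted between 6-nt perfect inverted repeats.
chimp = "TTGACC" + "AAACCGTTAGC" + "GGTCAA"
human = "TTGACC" + reverse_complement("AAACCGTTAGC") + "GGTCAA"
print(is_exact_microinversion(human, chimp, 6, 17))
print(flanked_by_inverted_repeat(human, 6, 17, 6))
```

Nearly perfect repeats, which the abstract also mentions, would need a mismatch-tolerant comparison in place of the strict equality used here.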
Affiliation(s)
- Nadezhda A Potapova
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Moscow, Russia, 127051.
- Alexey S Kondrashov
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109, USA
- Sergei M Mirkin
- Department of Biology, Tufts University, Medford, MA, 02155, USA.
2
Qu L, Wang L, He F, Han Y, Yang L, Wang MD, Zhu H. The Landscape of Micro-Inversions Provide Clues for Population Genetic Analysis of Humans. Interdiscip Sci 2020; 12:499-514. [PMID: 32929667 PMCID: PMC7658078 DOI: 10.1007/s12539-020-00392-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/25/2020] [Revised: 09/02/2020] [Accepted: 09/03/2020] [Indexed: 11/04/2022]
Abstract
BACKGROUND Variations in the human genome have been studied extensively. However, little is known about the role of micro-inversions (MIs), generally defined as small (< 100 bp) inversions, in human evolution, diversity, and health. Depicting the pattern of MIs among diverse populations is critical for interpreting human evolutionary history and obtaining insight into genetic diseases. RESULTS In this paper, we explored the distribution of MIs in genomes from 26 human populations and 7 nonhuman primate genomes and analyzed the phylogenetic structure of the 26 human populations based on the MIs. We further investigated the functions of the MIs located within genes associated with human health. With hg19 as the reference genome, we detected 6968 MIs among the 1937 human samples and 24,476 MIs among the 7 nonhuman primate genomes. The analyses of MIs in human genomes showed that the MIs were rarely located in exonic regions. Nonhuman primates and human populations shared only 82 inverted alleles, and Africans had the most inverted alleles in common with nonhuman primates, which was consistent with the "Out of Africa" hypothesis. The clustering of MIs among the human populations also coincided with human migration history and ancestral lineages. CONCLUSIONS We propose that MIs are potential evolutionary markers for investigating population dynamics. Our results revealed the diversity of MIs in human populations and showed that they are essential to construct human population relationships and have a potential effect on human health.
Affiliation(s)
- Li Qu
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, 100871, China
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech and Emory University, Atlanta, GA, 30332, USA
- Luotong Wang
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, 100871, China
- Center for Quantitative Biology, Peking University, Beijing, 100871, China
- Feifei He
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, 100871, China
- Center for Quantitative Biology, Peking University, Beijing, 100871, China
- Yilun Han
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, 100871, China
- Center for Quantitative Biology, Peking University, Beijing, 100871, China
- Longshu Yang
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, 100871, China
- Center for Quantitative Biology, Peking University, Beijing, 100871, China
- May D Wang
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech and Emory University, Atlanta, GA, 30332, USA
- Huaiqiu Zhu
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, 100871, China.
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech and Emory University, Atlanta, GA, 30332, USA.
- Center for Quantitative Biology, Peking University, Beijing, 100871, China.
3
Karageorgiou C, Tarrío R, Rodríguez-Trelles F. The Cyclically Seasonal Drosophila subobscura Inversion O7 Originated From Fragile Genomic Sites and Relocated Immunity and Metabolic Genes. Front Genet 2020; 11:565836. [PMID: 33193649 PMCID: PMC7584159 DOI: 10.3389/fgene.2020.565836] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/26/2020] [Accepted: 09/09/2020] [Indexed: 11/28/2022] Open
Abstract
Chromosome inversions are important contributors to standing genetic variation in Drosophila subobscura. Presently, the species is experiencing a rapid replacement of high-latitude by low-latitude inversions associated with global warming. Yet not all low-latitude inversions are correlated with the ongoing warming trend. This is particularly unexpected in the case of O7 because it shows a regular seasonal cycle that peaks in summer and rose with a heatwave. The inconsistent behavior of O7 across components of the ambient temperature suggests that its dynamics have more complex causes than temperature alone. In order to understand the dynamics of O7, high-quality genomic data are needed to determine both the breakpoints and the genetic content. To fill this gap, here we generated a PacBio long read-based chromosome-scale genome assembly, from a highly homozygous line made isogenic for an O3+4+7 chromosome. Then we isolated the complete continuous sequence of O7 by conserved synteny analysis with the available reference genome. Main findings include the following: (i) the assembled O7 inversion stretches 9.936 Mb, containing > 1,000 annotated genes; (ii) O7 had a complex origin, involving multiple breaks associated with non-B DNA-forming motifs, formation of a microinversion, and ectopic repair in trans with the two homologous chromosomes; (iii) the O7 breakpoints carry a pre-inversion record of fragility, including a sequence insertion, and transposition with later inverted duplication of an Attacin immunity gene; and (iv) the O7 inversion relocated the major insulin signaling forkhead box subgroup O (foxo) gene in tight linkage with its antagonistic regulatory partner serine/threonine-protein kinase B (Akt1) and disrupted concerted evolution of the two inverted Attacin duplicates, reattaching them to dFOXO metabolic enhancers. Our findings suggest that O7 exerts antagonistic pleiotropic effects on reproduction and immunity, setting a framework to understand its relationship with climate change. Furthermore, they are relevant for fragility in genome rearrangement evolution and for current views on the contribution of breakage versus repair in shaping inversion-breakpoint junctions.
Affiliation(s)
- Charikleia Karageorgiou
- Grup de Genòmica, Bioinformàtica i Biologia Evolutiva (GGBE), Departament de Genètica i de Microbiologia, Universitat Autonòma de Barcelona, Barcelona, Spain
- Rosa Tarrío
- Grup de Genòmica, Bioinformàtica i Biologia Evolutiva (GGBE), Departament de Genètica i de Microbiologia, Universitat Autonòma de Barcelona, Barcelona, Spain
- Francisco Rodríguez-Trelles
- Grup de Genòmica, Bioinformàtica i Biologia Evolutiva (GGBE), Departament de Genètica i de Microbiologia, Universitat Autonòma de Barcelona, Barcelona, Spain
4
Frith MC, Khan S. A survey of localized sequence rearrangements in human DNA. Nucleic Acids Res 2019; 46:1661-1673. [PMID: 29272440 PMCID: PMC5829575 DOI: 10.1093/nar/gkx1266] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/30/2017] [Accepted: 12/07/2017] [Indexed: 01/29/2023] Open
Abstract
Genomes mutate and evolve in ways simple (substitution or deletion of bases) and complex (e.g. chromosome shattering). We do not fully understand what types of complex mutation occur, and we cannot routinely characterize arbitrarily-complex mutations in a high-throughput, genome-wide manner. Long-read DNA sequencing methods (e.g. PacBio, nanopore) are promising for this task, because one read may encompass a whole complex mutation. We describe an analysis pipeline to characterize arbitrarily-complex 'local' mutations, i.e. intrachromosomal mutations encompassed by one DNA read. We apply it to nanopore and PacBio reads from one human cell line (NA12878), and survey sequence rearrangements, both real and artifactual. Almost all the real rearrangements belong to recurring patterns or motifs: the most common is tandem multiplication (e.g. heptuplication), but there are also complex patterns such as localized shattering, which resembles DNA damage by radiation. Gene conversions are identified, including one between hemoglobin gamma genes. This study demonstrates a way to find intricate rearrangements with any number of duplications, deletions, and repositionings. It demonstrates a probability-based method to resolve ambiguous rearrangements involving highly similar sequences, as occurs in gene conversion. We present a catalog of local rearrangements in one human cell line, and show which rearrangement patterns occur.
Affiliation(s)
- Martin C Frith
- Artificial Intelligence Research Center, AIST, Tokyo 135-0064, Japan; Graduate School of Frontier Sciences, University of Tokyo, Chiba 277-8562, Japan; Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), AIST, Tokyo 169-8555, Japan
- Sofia Khan
- Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), AIST, Tokyo 169-8555, Japan
5
Bastos CAC, Afreixo V, Rodrigues JMOS, Pinho AJ, Silva RM. Distribution of Distances Between Symmetric Words in the Human Genome: Analysis of Regular Peaks. Interdiscip Sci 2019; 11:367-372. [PMID: 30911903 DOI: 10.1007/s12539-019-00326-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/24/2018] [Revised: 01/24/2019] [Accepted: 02/27/2019] [Indexed: 11/29/2022]
Abstract
Finding DNA sites with high potential for the formation of hairpin/cruciform structures is an important task. Previous works studied the distances between adjacent reversed complement words (symmetric word pairs) and also between non-adjacent words. It was observed that for some words a few distances were favoured (peaks) and that in some distributions there was strong peak regularity. The present work extends previous studies by improving the detection and characterization of peak regularities in the symmetric word pair distance distributions of the human genome. This work also analyzes the location of the sequences that give rise to the observed strong peak periodicity in the distance distribution. The results obtained in this work may indicate genomic sites with potential for the formation of hairpin/cruciform structures.
Affiliation(s)
- Carlos A C Bastos
- Department of Electronics, Telecommunications and Informatics, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal.
- Vera Afreixo
- Department of Mathematics, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, CIDMA-Center for Research and Development in Mathematics and Applications, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
- João M O S Rodrigues
- Department of Electronics, Telecommunications and Informatics, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
- Armando J Pinho
- Department of Electronics, Telecommunications and Informatics, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
- Raquel M Silva
- Department of Medical Sciences, iBiMED, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
6
Ruiz-Ruano FJ, Navarro-Domínguez B, Camacho JPM, Garrido-Ramos MA. Full plastome sequence of the fern Vandenboschia speciosa (Hymenophyllales): structural singularities and evolutionary insights. J Plant Res 2019; 132:3-17. [PMID: 30552526 DOI: 10.1007/s10265-018-1077-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 07/09/2018] [Accepted: 11/26/2018] [Indexed: 05/14/2023]
Abstract
We provide here the first full chloroplast genome sequence, i.e., the plastome, for a species belonging to the fern order Hymenophyllales. The phylogenetic position of this order within leptosporangiate ferns, together with the general scarcity of information about fern plastomes, makes this research a valuable contribution to the analysis of plastome diversity throughout fern evolution. The gene content of the V. speciosa plastome was similar to that in most ferns, although there were some characteristic gene losses and lineage-specific differences. In addition, a considerable number of genes required U to C RNA editing for proper protein translation and two genes showed start codons alternative to the canonical AUG (AUA). Concerning gene order, V. speciosa shared the specific 30-kb inversion of euphyllophyte plastomes and the 3.3-kb inversion of fern plastomes, keeping the ancestral gene order shared by eusporangiate and early leptosporangiate ferns. Conversely, V. speciosa has expanded IR regions comprising the rps7, rps12, ndhB and trnL genes in addition to rRNA and other tRNA genes, a condition shared with several eusporangiate ferns, lycophytes and hornworts, as well as most seed plants.
Affiliation(s)
- F J Ruiz-Ruano
- Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
- B Navarro-Domínguez
- Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
- J P M Camacho
- Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
7
Olson D, Wheeler T. ULTRA: A Model Based Tool to Detect Tandem Repeats. ACM-BCB: ACM Conference on Bioinformatics, Computational Biology and Biomedicine 2018; 2018:37-46. [PMID: 31080962 DOI: 10.1145/3233547.3233604] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 10/28/2022]
Abstract
In biological sequences, tandem repeats consist of tens to hundreds of residues of a repeated pattern, such as atgatgatgatgatg ('atg' repeated), often the result of replication slippage. Over time, these repeats decay so that the original sharp pattern of repetition is somewhat obscured, but even degenerate repeats pose a problem for sequence annotation: when two sequences both contain shared patterns of similar repetition, the result can be a false signal of sequence homology. We describe an implementation of a new hidden Markov model for detecting tandem repeats that shows substantially improved sensitivity to labeling decayed repetitive regions, presents low and reliable false annotation rates across a wide range of sequence composition, and produces scores that follow a stable distribution. On typical genomic sequence, the time and memory requirements of the resulting tool (ULTRA) are competitive with the most heavily used tool for repeat masking (TRF). ULTRA is released under an open source license and lays the groundwork for inclusion of the model in sequence alignment tools and annotation pipelines.
8
Fresco JR, Amosova O. Site-Specific Self-Catalyzed DNA Depurination: A Biological Mechanism That Leads to Mutations and Creates Sequence Diversity. Annu Rev Biochem 2017; 86:461-484. [PMID: 28654322 DOI: 10.1146/annurev-biochem-070611-095951] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
Self-catalyzed DNA depurination is a sequence-specific physiological mechanism mediated by spontaneous extrusion of a stem-loop catalytic intermediate. Hydrolysis of the 5'G residue of the 5'GA/TGG loop and of the first 5'A residue of the 5'GAGA loop, together with particular first stem base pairs, specifies their hydrolysis without involving protein, cofactor, or cation. As such, this mechanism is the only known DNA catalytic activity exploited by nature. The consensus sequences for self-depurination of such G- and A-loop residues occur in all genomes examined across the phyla, averaging one site every 2,000-4,000 base pairs. Because apurinic sites are subject to error-prone repair, leading to substitution and short frameshift mutations, they are both a source of genome damage and a means for creating sequence diversity. Their marked overrepresentation in genomes, and largely unchanging density from the lowest to the highest organisms, indicate their selection over the course of evolution. The mutagenicity at such sites in many human genes is associated with loss of function of key proteins responsible for diverse diseases.
Affiliation(s)
- Jacques R Fresco
- Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
- Olga Amosova
- Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
9
Tavares AHMP, Pinho AJ, Silva RM, Rodrigues JMOS, Bastos CAC, Ferreira PJSG, Afreixo V. DNA word analysis based on the distribution of the distances between symmetric words. Sci Rep 2017; 7:728. [PMID: 28389642 PMCID: PMC5428789 DOI: 10.1038/s41598-017-00646-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/11/2016] [Accepted: 03/02/2017] [Indexed: 02/01/2023] Open
Abstract
We address the problem of discovering pairs of symmetric genomic words (i.e., words and the corresponding reversed complements) occurring at distances that are overrepresented. For this purpose, we developed new procedures to identify symmetric word pairs with uncommon empirical distance distribution and with clusters of overrepresented short distances. We speculate that patterns of overrepresentation of short distances between symmetric word pairs may allow the occurrence of non-standard DNA conformations, such as hairpin/cruciform structures. We focused on the human genome, and analysed both the complete genome as well as a version with known repetitive sequences masked out. We reported several well-defined features in the distributions of distances, which can be classified into three different profiles, showing enrichment in distinct distance ranges. We analysed in greater detail certain pairs of symmetric words of length seven, found by our procedure, characterised by the surprising fact that they occur at single distances more frequently than expected.
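The idea of a distance distribution between a word and its reversed complement can be sketched in a few lines of Python. This is a simplified illustration, not the procedure developed in the paper: the function names are hypothetical, and measuring from each occurrence of the word to the nearest downstream occurrence of its reversed complement is an assumed definition chosen for brevity.

```python
# Illustrative sketch only: names and the distance definition are assumptions.
from bisect import bisect_right
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Return the reversed complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def find_all(seq: str, word: str) -> list[int]:
    """Return start positions of all (possibly overlapping) occurrences."""
    positions = []
    i = seq.find(word)
    while i != -1:
        positions.append(i)
        i = seq.find(word, i + 1)
    return positions


def symmetric_distance_counts(seq: str, word: str) -> Counter:
    """Count distances from each occurrence of `word` to the nearest
    downstream occurrence of its reversed complement."""
    word_pos = find_all(seq, word)
    rc_pos = find_all(seq, reverse_complement(word))  # already sorted
    counts = Counter()
    for p in word_pos:
        j = bisect_right(rc_pos, p)  # first reversed-complement start after p
        if j < len(rc_pos):
            counts[rc_pos[j] - p] += 1
    return counts


# Toy sequence: ACGG at positions 0 and 13, its reversed complement CCGT at 7 and 17.
print(symmetric_distance_counts("ACGGTTTCCGTAAACGGCCGT", "ACGG"))
```

Deciding whether a distance is overrepresented would additionally require a reference distribution to compare against, which this sketch does not attempt.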
Affiliation(s)
- Ana H M P Tavares
- Department of Mathematics & CIDMA, University of Aveiro, Aveiro, Portugal; Department of Medical Sciences & iBiMED, University of Aveiro, Aveiro, Portugal
- Armando J Pinho
- Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal; IEETA, University of Aveiro, Aveiro, Portugal
- Raquel M Silva
- Department of Medical Sciences & iBiMED, University of Aveiro, Aveiro, Portugal; IEETA, University of Aveiro, Aveiro, Portugal
- João M O S Rodrigues
- Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal; IEETA, University of Aveiro, Aveiro, Portugal
- Carlos A C Bastos
- Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal; IEETA, University of Aveiro, Aveiro, Portugal
- Paulo J S G Ferreira
- Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal; IEETA, University of Aveiro, Aveiro, Portugal
- Vera Afreixo
- Department of Mathematics & CIDMA, University of Aveiro, Aveiro, Portugal; Department of Medical Sciences & iBiMED, University of Aveiro, Aveiro, Portugal; IEETA, University of Aveiro, Aveiro, Portugal
10
LINE-1 retrotransposons: from 'parasite' sequences to functional elements. J Appl Genet 2014; 56:133-45. [PMID: 25106509 DOI: 10.1007/s13353-014-0241-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 03/04/2014] [Revised: 07/24/2014] [Accepted: 07/25/2014] [Indexed: 10/24/2022]
Abstract
Long interspersed nuclear elements-1 (LINE-1) are the most abundant and active retrotransposons in mammalian genomes. Traditionally, the occurrence of LINE-1 sequences in the genome of mammals has been explained by the selfish DNA hypothesis. More recently, however, it has been argued that these sequences could play important roles in these genomes, such as in the regulation of gene expression, genome modelling and X-chromosome inactivation. The non-random chromosomal distribution is a striking feature of these retroelements that somehow reflects their functionality. In the present study, we have isolated and analysed a fraction of the open reading frame 2 (ORF2) LINE-1 sequence from three rodent species, Cricetus cricetus, Peromyscus eremicus and Praomys tullbergi. Physical mapping of the isolated sequences revealed an interspersed longitudinal AT pattern of distribution along all the chromosomes of the complement in the three genomes. A detailed analysis shows that these sequences are preferentially located in the euchromatic regions, although some signals could be detected in the heterochromatin. In addition, a coincidence between the location of imprinted gene regions (such as the Xist and Tsix gene regions) and the LINE-1 retroelements was also observed. According to these results, we propose an involvement of LINE-1 sequences in different genomic events such as gene imprinting, X-chromosome inactivation and evolution of repetitive sequences located at the heterochromatic regions (e.g. satellite DNA sequences) of the rodent genomes analysed.
11
A high resolution map of mammalian X chromosome fragile regions assessed by large-scale comparative genomics. Mamm Genome 2014; 25:618-35. [PMID: 25086724 DOI: 10.1007/s00335-014-9537-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/30/2014] [Accepted: 07/14/2014] [Indexed: 10/24/2022]
Abstract
Chromosomal evolution involves multiple changes at structural and numerical levels. These changes, which are related to variation in gene number and location, can be tracked by the identification of syntenic blocks (SB). First reports proposed that ~180-280 SB might be shared by mouse and human species. More recently, further studies including additional genomes have identified up to ~1,400 SB during the evolution of eutherian species. A considerable number of studies regarding the X chromosome's structure and evolution have been undertaken because of its extraordinary biological impact on reproductive fitness and speciation. Some have identified evolutionary breakpoint regions and fragile sites at specific locations in the human X chromosome. However, mapping these regions to date has involved using low-to-moderate resolution techniques, which may have led to underestimating their total number and to locating them inaccurately. The present study used a combination of bioinformatics methods to identify, at base-pair level, chromosomal rearrangements occurring during X chromosome evolution in 13 mammalian species. A comparative technique using four different algorithms was used for optimizing the detection of hotspot regions in the human X chromosome. We identified a significant interspecific variation in SB size which was related to genetic information gain regarding the human X chromosome. We found that human hotspot regions were enriched in LINE-1 and Alu transposable elements, which may have led to intraspecific chromosome rearrangement events. New fragile regions located in the human X chromosome have also been postulated. We expect the high resolution map of X chromosome fragile sites presented here to provide useful data for future studies on mammalian evolution and human disease.
12
Calvete O, González J, Betrán E, Ruiz A. Segmental duplication, microinversion, and gene loss associated with a complex inversion breakpoint region in Drosophila. Mol Biol Evol 2012; 29:1875-89. [PMID: 22328714 DOI: 10.1093/molbev/mss067] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 01/01/2023] Open
Abstract
Chromosomal inversions are usually portrayed as simple two-breakpoint rearrangements changing gene order but not gene number or structure. However, increasing evidence suggests that inversion breakpoints may often have a complex structure and entail gene duplications with potential functional consequences. Here, we used a combination of different techniques to investigate the breakpoint structure and the functional consequences of a complex rearrangement fixed in Drosophila buzzatii and comprising two tandemly arranged inversions sharing the middle breakpoint: 2m and 2n. By comparing the sequence in the breakpoint regions between D. buzzatii (inverted chromosome) and D. mojavensis (noninverted chromosome), we corroborate the breakpoint reuse at the molecular level and infer that inversion 2m was associated with a duplication of a ~13 kb segment and likely generated by staggered breaks plus repair by nonhomologous end joining. The duplicated segment contained the gene CG4673, involved in nuclear transport, and its two nested genes CG5071 and CG5079. Interestingly, we found that other than the inversion and the associated duplication, both breakpoints suffered additional rearrangements, that is, the proximal breakpoint experienced a microinversion event associated at both ends with a 121-bp long duplication that contains a promoter. As a consequence of all these different rearrangements, CG5079 has been lost from the genome, CG5071 is now a single copy nonnested gene, and CG4673 has a transcript ~9 kb shorter and seems to have acquired a more complex gene regulation. Our results illustrate the complex effects of chromosomal rearrangements and highlight the need of complementing genomic approaches with detailed sequence-level and functional analyses of breakpoint regions if we are to fully understand genome structure, function, and evolutionary dynamics.
Affiliation(s)
- Oriol Calvete
- Departament de Genètica i de Microbiologia, Facultat de Biociències, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
13
Farré M, Bosch M, López-Giráldez F, Ponsà M, Ruiz-Herrera A. Assessing the role of tandem repeats in shaping the genomic architecture of great apes. PLoS One 2011; 6:e27239. [PMID: 22076140 PMCID: PMC3208591 DOI: 10.1371/journal.pone.0027239] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 04/06/2011] [Accepted: 10/12/2011] [Indexed: 11/18/2022] Open
Abstract
Background: Ancestral reconstructions of mammalian genomes have revealed that evolutionary breakpoint regions are clustered in regions that are more prone to break and reorganize. What is still unclear to evolutionary biologists is whether these regions are physically unstable due solely to sequence composition and/or genome organization, or whether they represent genomic areas where the selection against breakpoints is minimal. Methodology and Principal Findings: Here we present a comprehensive study of the distribution of tandem repeats in great apes. We analyzed the distribution of tandem repeats in relation to the localization of evolutionary breakpoint regions in the human, chimpanzee, orangutan and macaque genomes. We observed an accumulation of tandem repeats in the genomic regions implicated in chromosomal reorganizations. In the case of the human genome, our analyses revealed that evolutionary breakpoint regions contained more base pairs implicated in tandem repeats than synteny blocks did, with the AAAT motif being the most frequently involved in evolutionary regions. We found that those AAAT repeats located in evolutionary regions were preferentially associated with Alu elements. Significance: Our observations provide evidence for the role of tandem repeats in shaping mammalian genome architecture. We hypothesize that an accumulation of specific tandem repeats in evolutionary regions can promote genome instability by altering the state of the chromatin conformation or by promoting the insertion of transposable elements.
Affiliation(s)
- Marta Farré
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Francesc López-Giráldez
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Montserrat Ponsà
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Aurora Ruiz-Herrera
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Institut de Biotecnologia i Biomedicina (IBB), Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
|
14
|
Amosova O, Kumar V, Deutsch A, Fresco JR. Self-catalyzed site-specific depurination of G residues mediated by cruciform extrusion in closed circular DNA plasmids. J Biol Chem 2011; 286:36322-30. [PMID: 21868375 PMCID: PMC3196133 DOI: 10.1074/jbc.m111.272112] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 06/14/2011] [Revised: 08/23/2011] [Indexed: 11/06/2022]
Abstract
A major variety of "spontaneous" genomic damage is endogenous generation of apurinic sites. Depurination rates vary widely across genomes, occurring with higher frequency at "depurination hot spots." Recently, we discovered a site-specific self-catalyzed depurinating activity in short (14-18 nucleotides) DNA stem-loop-forming sequences with a 5'-G(T/A)GG-3' loop and T·A or G·C as the first base pair at the base of the loop; the 5'-G residue of the loop self-depurinates at least 10^5-fold faster than random "spontaneous" depurination at pH 5. Formation of the catalytic intermediate for self-depurination in double-stranded DNA requires a stem-loop to extrude as part of a cruciform. In this study, evidence is presented for self-catalyzed depurination mediated by cruciform formation in plasmid DNA in vitro. Cruciform extrusion was confirmed, and its extent was quantitated by digestion of the plasmid with single strand-specific mung bean endonuclease, followed by restriction digestion and sequencing of resulting mung bean-generated fragments. Appearance of the apurinic site in the self-depurinating stem-loop was confirmed by digestion of plasmid DNA with apurinic endonuclease IV, followed by primer extension and/or PCR amplification to detect the endonuclease-generated strand break and identify its location. Self-catalyzed depurination was contingent on the plasmid being supercoiled and was not observed in linearized plasmids, consistent with the presence of the extruded cruciform in the supercoiled plasmid and not in the linear one. These results indicate that self-catalyzed depurination is not unique to single-stranded DNA; rather, it can occur in stem-loop structures extruding from double-stranded DNA and therefore could, in principle, occur in vivo.
Affiliation(s)
- Olga Amosova
- From the Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
- Veena Kumar
- From the Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
- Aaron Deutsch
- From the Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
- Jacques R. Fresco
- From the Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544
|
15
|
Hara Y, Imanishi T. Abundance of ultramicro inversions within local alignments between human and chimpanzee genomes. BMC Evol Biol 2011; 11:308. [PMID: 22011259 PMCID: PMC3227671 DOI: 10.1186/1471-2148-11-308] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 06/28/2011] [Accepted: 10/19/2011] [Indexed: 11/18/2022]
Abstract
Background: Chromosomal inversion is one of the most important mechanisms of evolution. Recent studies of comparative genomics have revealed that chromosomal inversions are abundant in the human genome. While such previously characterized inversions are large enough to be identified as a single alignment or a string of local alignments, the impact on evolution of ultramicro inversions, which are so short that the local alignments completely cover them, is still uncertain. Results: In this study, we developed a method for identifying ultramicro inversions by scanning local alignments. This technique achieved a high sensitivity and a very low rate of false positives. We identified 2,377 ultramicro inversions ranging from five to 125 bp within the orthologous alignments between the human and chimpanzee genomes. The false positive rate was estimated to be around 4%. Based on phylogenetic profiles using the primate outgroups, 479 ultramicro inversions were inferred to have specifically inverted in the human lineage. Ultramicro inversions exclusively involving adenine and thymine were the most frequent: 461 inversions (19.4% of the total). Furthermore, the density of ultramicro inversions in chromosome Y and the neighborhoods of transposable elements was higher than average. Sixty-five ultramicro inversions were identified within the exons of human protein-coding genes. Conclusions: We defined ultramicro inversions as inverted regions equal to or smaller than 125 bp buried within local alignments. Our observations suggest that ultramicro inversions are abundant between the human and chimpanzee genomes, and that the location of the inversions correlates with genome structural instability. Some of the ultramicro inversions may contribute to gene evolution. Our inversion-identification method is also applicable to the fine-tuning of genome alignments by distinguishing ultramicro inversions from nucleotide substitutions and indels.
Affiliation(s)
- Yuichiro Hara
- Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Aomi 2-4-7, Koto-ku, Tokyo, Japan
|
16
|
Hou M, Yao P, Antonou A, Johns MA. Pico-inplace-inversions between human and chimpanzee. Bioinformatics 2011; 27:3266-75. [PMID: 21994225 DOI: 10.1093/bioinformatics/btr566] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
Abstract
MOTIVATION: There have been several studies on the micro-inversions between human and chimpanzee, but there are large discrepancies among their results. Furthermore, all of them rely on alignment procedures or existing alignment results to identify inversions. However, the core alignment procedures do not take very small inversions into consideration. Therefore, their analyses cannot find inversions that are too small to be detected by a classic aligner. We call such inversions pico-inversions. RESULTS: We re-analyzed human-chimpanzee alignment from the UCSC Genome Browser for micro-inplace-inversions and screened for pico-inplace-inversions using a likelihood ratio test. We report that the quantity of inplace-inversions between human and chimpanzee is substantially greater than what had previously been discovered. We also present the software tool PicoInversionMiner to detect pico-inplace-inversions between closely related species. AVAILABILITY: Software tools, scripts and result data are available at http://faculty.cs.niu.edu/~hou/PicoInversion.html. CONTACT: mhou@cs.niu.edu.
Affiliation(s)
- Minmei Hou
- Department of Computer Science, Northern Illinois University, DeKalb, IL 60115, USA.
|
17
|
Bacolla A, Wang G, Jain A, Chuzhanova NA, Cer RZ, Collins JR, Cooper DN, Bohr VA, Vasquez KM. Non-B DNA-forming sequences and WRN deficiency independently increase the frequency of base substitution in human cells. J Biol Chem 2011; 286:10017-26. [PMID: 21285356 PMCID: PMC3060453 DOI: 10.1074/jbc.m110.176636] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 08/18/2010] [Revised: 01/31/2011] [Indexed: 01/01/2023]
Abstract
Although alternative DNA secondary structures (non-B DNA) can induce genomic rearrangements, their associated mutational spectra remain largely unknown. The helicase activity of WRN, which is absent in the human progeroid Werner syndrome, is thought to counteract this genomic instability. We determined non-B DNA-induced mutation frequencies and spectra in human U2OS osteosarcoma cells and assessed the role of WRN in isogenic knockdown (WRN-KD) cells using a supF gene mutation reporter system flanked by triplex- or Z-DNA-forming sequences. Although both non-B DNA and WRN-KD served to increase the mutation frequency, the increase afforded by WRN-KD was independent of DNA structure despite the fact that purified WRN helicase was found to resolve these structures in vitro. In U2OS cells, ∼70% of mutations comprised single-base substitutions, mostly at G·C base-pairs, with the remaining ∼30% being microdeletions. The number of mutations at G·C base-pairs in the context of NGNN/NNCN sequences correlated well with predicted free energies of base stacking and ionization potentials, suggesting a possible origin via oxidation reactions involving electron loss and subsequent electron transfer (hole migration) between neighboring bases. A set of ∼40,000 somatic mutations at G·C base pairs identified in a lung cancer genome exhibited similar correlations, implying that hole migration may also be involved. We conclude that alternative DNA conformations, WRN deficiency and lung tumorigenesis may all serve to increase the mutation rate by promoting, through diverse pathways, oxidation reactions that perturb the electron orbitals of neighboring bases. It follows that such "hole migration" is likely to play a much more widespread role in mutagenesis than previously anticipated.
Affiliation(s)
- Albino Bacolla
- From the Department of Molecular Carcinogenesis, Science Park-Research Division, The University of Texas, M. D. Anderson Cancer Center, Smithville, Texas 78957
- Guliang Wang
- From the Department of Molecular Carcinogenesis, Science Park-Research Division, The University of Texas, M. D. Anderson Cancer Center, Smithville, Texas 78957
- Aklank Jain
- From the Department of Molecular Carcinogenesis, Science Park-Research Division, The University of Texas, M. D. Anderson Cancer Center, Smithville, Texas 78957
- Nadia A. Chuzhanova
- the School of Science and Technology, Nottingham Trent University, Nottingham, NG11 8NS, United Kingdom
- Regina Z. Cer
- the Advanced Biomedical Computing Center, SAIC-Frederick, Inc., NCI-Frederick, Frederick, Maryland 21702
- Jack R. Collins
- the Advanced Biomedical Computing Center, SAIC-Frederick, Inc., NCI-Frederick, Frederick, Maryland 21702
- David N. Cooper
- the Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, CF14 4XN, United Kingdom
- Vilhelm A. Bohr
- the Laboratory of Molecular Gerontology, National Institute on Aging, National Institutes of Health, Baltimore, Maryland 21224
- Karen M. Vasquez
- From the Department of Molecular Carcinogenesis, Science Park-Research Division, The University of Texas, M. D. Anderson Cancer Center, Smithville, Texas 78957
|
18
|
Jain A, Bacolla A, Chakraborty P, Grosse F, Vasquez KM. Human DHX9 helicase unwinds triple-helical DNA structures. Biochemistry 2010; 49:6992-9. [PMID: 20669935 DOI: 10.1021/bi100795m] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.7] [Indexed: 01/09/2023]
Abstract
Naturally occurring poly(purine·pyrimidine) rich regions in the human genome are prone to adopting non-canonical DNA structures such as intramolecular triplexes (i.e., H-DNA). Such structure-forming sequences are abundant and can regulate the expression of several disease-linked genes. In addition, the use of triplex-forming oligonucleotides (TFOs) to modulate gene structure and function has potential as an approach to targeted gene therapy. Previously, we found that endogenous H-DNA structures can induce DNA double-strand breaks and promote genomic rearrangements. Herein, we find that the DHX9 helicase co-immunoprecipitates with triplex DNA structures in mammalian cells, suggesting a role in the maintenance of genome stability. We tested this postulate by assessing the helicase activity of purified human DHX9 on various duplex and triplex DNA substrates in vitro. DHX9 displaced the third strand from a specific triplex DNA structure and catalyzed the unwinding with a 3' → 5' polarity with respect to the displaced third strand. Helicase activity required a 3'-single-stranded overhang on the third strand and was dependent on ATP hydrolysis. The reaction kinetics consisted of a pre-steady-state burst phase followed by a linear, steady-state pseudo-zero-order reaction. In contrast, very little if any helicase activity was detected on blunt triplexes, triplexes with 5'-overhangs, blunt duplexes, duplexes with overhangs, or forked duplex substrates. Thus, triplex structures containing a 3'-overhang represent preferred substrates for DHX9, where it removes the strand with Hoogsteen hydrogen-bonded bases. Our results suggest the involvement of DHX9 in maintaining genome integrity by unwinding mutagenic triplex DNA structures.
Affiliation(s)
- Aklank Jain
- Department of Carcinogenesis, Science Park-Research Division, The University of Texas M. D. Anderson Cancer Center, Smithville, Texas 78957, USA
|
19
|
Evolution in health and medicine Sackler colloquium: Genomic disorders: a window into human gene and genome evolution. Proc Natl Acad Sci U S A 2010; 107 Suppl 1:1765-71. [PMID: 20080665 DOI: 10.1073/pnas.0906222107] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Indexed: 12/13/2022]
Abstract
Gene duplications alter the genetic constitution of organisms and can be a driving force of molecular evolution in humans and the great apes. In this context, the study of genomic disorders has uncovered the essential role played by the genomic architecture, especially low copy repeats (LCRs) or segmental duplications (SDs). In fact, regardless of the mechanism, LCRs can mediate or stimulate rearrangements, inciting genomic instability and generating dynamic and unstable regions prone to rapid molecular evolution. In humans, copy-number variation (CNV) has been implicated in common traits such as neuropathy, hypertension, color blindness, infertility, and behavioral traits including autism and schizophrenia, as well as disease susceptibility to HIV, lupus nephritis, and psoriasis among many other clinical phenotypes. The same mechanisms implicated in the origin of genomic disorders may also play a role in the emergence of segmental duplications and the evolution of new genes by means of genomic and gene duplication and triplication, exon shuffling, exon accretion, and fusion/fission events.
|
20
|
Zhao J, Bacolla A, Wang G, Vasquez KM. Non-B DNA structure-induced genetic instability and evolution. Cell Mol Life Sci 2010; 67:43-62. [PMID: 19727556 PMCID: PMC3017512 DOI: 10.1007/s00018-009-0131-2] [Citation(s) in RCA: 325] [Impact Index Per Article: 21.7] [Received: 06/18/2009] [Revised: 07/22/2009] [Accepted: 08/11/2009] [Indexed: 11/26/2022]
Abstract
Repetitive DNA motifs are abundant in the genomes of various species and have the capacity to adopt non-canonical (i.e., non-B) DNA structures. Several non-B DNA structures, including cruciforms, slipped structures, triplexes, G-quadruplexes, and Z-DNA, have been shown to cause mutations, such as deletions, expansions, and translocations in both prokaryotes and eukaryotes. Their distributions in genomes are not random and often co-localize with sites of chromosomal breakage associated with genetic diseases. Current genome-wide sequence analyses suggest that the genomic instabilities induced by non-B DNA structure-forming sequences not only result in predisposition to disease, but also contribute to rapid evolutionary changes, particularly in genes associated with development and regulatory functions. In this review, we describe the occurrence of non-B DNA-forming sequences in various species, the classes of genes enriched in non-B DNA-forming sequences, and recent mechanistic studies on DNA structure-induced genomic instability to highlight their importance in genomes.
Affiliation(s)
- Junhua Zhao
- Department of Carcinogenesis, Science Park-Research Division, The University of Texas M.D. Anderson Cancer Center, 1808 Park Road 1-C, P.O. Box 389, Smithville, TX 78957 USA
- Albino Bacolla
- Department of Carcinogenesis, Science Park-Research Division, The University of Texas M.D. Anderson Cancer Center, 1808 Park Road 1-C, P.O. Box 389, Smithville, TX 78957 USA
- Guliang Wang
- Department of Carcinogenesis, Science Park-Research Division, The University of Texas M.D. Anderson Cancer Center, 1808 Park Road 1-C, P.O. Box 389, Smithville, TX 78957 USA
- Karen M. Vasquez
- Department of Carcinogenesis, Science Park-Research Division, The University of Texas M.D. Anderson Cancer Center, 1808 Park Road 1-C, P.O. Box 389, Smithville, TX 78957 USA
|